








The Journal of Abnormal
and
Social Psychology
Founded by Morton Prince, 1906


CONTENTS


ARTICLES 
The Present State of Psychoanalytic Theory 
Merton Gill 


The Interpersonal Behavior of Children in Residential Treatment
Harold L. Raush, Allen T. Dittmann, and Thaddeus J. Taylor
Homogeneity of Member Personality and its Effect on Group Problem-Solving
L. Richard Hoffman


The Effects of Mild Frustration on the Expression of Prejudiced Attitudes. . 
Emory L. Cowen, Judah Landes, and Donald E. Schaet


Acquired and Symbolic Affective Value as Determinants of Size Estimation
in Schizophrenic and Normal Subjects. ....
Theodore P. Zahn 


Mental Illness, Milieu Therapy, and Social Organization in Ward Groups. 
Edward J. Murray and Melvin Cohen


A Comparative Study of Individual, Majority, and Group Judgment... ... 
Dean C. Barnlund


Continued on inside front cover


Volume 58 JANUARY 1959 Number 1 














The Effect of Attitude and Experience . . .
Marshall H. Segall


Influence of Four Types of Data on Diagnostic Conceptualization in Psychological Testing
William F. Soskin


The Social Psychology of Rorschach Validity Research
Leon H. Levy and Thomas B. Orr


Multivariable Analysis of the Conceptual Behavior of Schizophrenic and
Brain-Damaged Patients
Donald B. Leventhal, Lawrence S. McGaughran, and Louis J. Moran


A General Formula for the . . .
Walter Toman


Sequential Dependencies . . .
Frank Auld, Jr. and Alice M. White


A Comparison of Mental Retardates . . . Aftereffects and Reversible Figures
Herman H. Spitz


. . . and Task Similarity
Hava Bonné Gewirtz


Social Influence on Opinions and the Communication of Related Content
Bertram H. Raven


CRITIQUE AND NOTES


The Test Anxiety Questionnaire: . . .
Zanwil Sperber


Donald R. Brown and Lois-Ellen


Individual Versus Group Goal Conflict
Ewart E. Smith


Semantic Aspects of Prognosis
Spiro B. Mitsos







Volume 58 


JANUARY, 1959 





The Journal of Abnormal 


and 


Social Psychology 


Editor: M. BREWSTER SMITH, New York University


Associate Editor: E. J. SHOBEN, Jr., Teachers College, Columbia University 


CONSULTING EDITORS


GORDON W. ALLPORT
Harvard University
URIE BRONFENBRENNER
Cornell University
ROGER BROWN
Massachusetts Institute of Technology
DONALD T. CAMPBELL
Northwestern University
LAUNOR F. CARTER
System Development Corporation
RICHARD CHRISTIE
Columbia University
MORTON DEUTSCH
Bell Telephone Laboratories
I. E. FARBER
State University of Iowa
DONALD W. FISKE
University of Chicago
ROGER W. HEYNS
University of Michigan
ROBERT R. HOLT
New York University
IRVING L. JANIS
Yale University
HAROLD E. JONES
University of California
HAROLD H. KELLEY
University of Minnesota
LEONARD S. KOGAN
Community Service Society of New York
DANIEL J. LEVINSON
Harvard Medical School
DAVID C. McCLELLAND
Harvard University
LEO POSTMAN
University of California
ELIOT H. RODNICK
Duke University
NEVITT SANFORD
University of California
SEYMOUR SARASON
Yale University
RICHARD L. SOLOMON
Harvard University
RENATO TAGIURI
Harvard University


PUBLISHED BY 


THE AMERICAN PSYCHOLOGICAL ASSOCIATION, Inc. 





Number 1 


The JOURNAL OF ABNORMAL AND SOCIAL PSYCHOLOGY is devoted to both abnormal
and social psychology. Its emphasis is on basic research and theory rather than the
techniques and arts of practice. Abnormal psychology is broadly defined to include
papers contributing to fundamental knowledge of the pathology, dynamics, and develop- 
ment of personality or individual behavior, including deterioration with age and dis- 
ease. Articles concerned with psychodiagnostic techniques are evaluated with respect 
to their contribution to an understanding of the psychological principles of diagnosis;
those concerned with psychotherapy are judged in terms of their contribution to an 
understanding of the therapeutic process. Case reports are considered in terms of their 
heuristic value. Reports which serve to pose or to clarify important theoretical problems, 
or which promise to be useful in teaching, are often published. From the social area 
this JOURNAL gives preference to papers contributing to basic knowledge of interpersonal
relations, and of group influences on the pathology, dynamics, and development
of individual behavior.


The JOURNAL is issued bimonthly in January, March, May, July, September, and
November, two volumes per year. The two volumes average 832 pages. The sub- 
scription price per year is $16.00; foreign, $16.50. Single issues are $3.00. 


Original manuscripts should be sent to the Editor: 


Dr. M. BREWSTER SMITH
Graduate Department of Psychology
New York University
Washington Square
New York 3, N. Y.


Fifty free offprints are supplied to contributors. Authors of prior publication articles
receive no gratis offprints.





ARTHUR C. HOFFMAN, Managing Editor; HELEN ORR, Promotion Manager; BARBARA CUMMINGS, Editorial Assistant


Subscriptions, address changes, and business communications should be sent to: 


THE AMERICAN PSYCHOLOGICAL ASSOCIATION, INC.
1333 Sixteenth Street N.W. 
Washington 6, D. C. 


Changes of address must reach the Subscription Office by the 10th of the month to 
take effect the following month. Undelivered copies resulting from address changes will 
not be replaced; subscribers should notify the post office that they will guarantee sec- 
ond-class forwarding postage. Other claims for undelivered copies must be made within 
four months of publication. 

PUBLISHED BY 
THE AMERICAN PSYCHOLOGICAL ASSOCIATION, INC.
Mt. Royal and Guilford Aves., Baltimore 2, Md.
and 1333 Sixteenth Street N.W., Washington 6, D.C.


Second-class postage paid at Baltimore, Maryland, and at additional mailing offices


Copyright 1959 by the American Psychological Association, Inc.





THE PRESENT STATE OF PSYCHOANALYTIC THEORY¹


MERTON GILL 


Berkeley, 


PSYCHOANALYTIC theory is often regarded
by its critics as a rigid Procrustean 
system. In fact, it is continually chang- 
ing and growing, although its changes are 
principally those of widening perspectives, 
modifications, and additions, rather than 
alterations in basic assumptions. The past 
decade has seen a serious effort to make the 
system of psychoanalytic theory more ex- 
plicit and internally consistent. Though many 
people contributed to this effort, I would 
single out for special mention the work, singly 
and together, of Hartmann, Kris, and Loewen- 
stein (13, 14, 15, 16, 18, 19, 20, 27, 28, 29), of 
Rapaport (32, 33, 34, 35, 36, 37, 38, 39), and 
of Edith Jacobson (23, 24). But it would be 
incorrect to regard these efforts as simply 
tidying up the theory. Basic assumptions have 
been questioned and alternative assumptions 
have been proposed. A recent paper by Jacob- 
son (24), for example, seriously questions the 
validity and usefulness of such basic psycho- 
analytic propositions as primary narcissism 
and primary masochism. Of course, even
more fundamental propositions than these,
such as the importance of infantile psychosexuality,
have been questioned and discarded
by some psychoanalysts (31). The
main stream of psychoanalysis, however, 
holds fast to a number of basic assumptions 
while adding and modifying others. It is this 
main stream of the psychoanalytic theory 
that I will discuss here, though there will be 
occasional reference to the assumptions of the 
revisions of psychoanalytic theory. I believe 
that the ferment in the main stream of psycho- 
analytic theory is a refutation of the charge 
that it constitutes an orthodoxy. Any dis- 
cipline has its share of those who clutch to it 
as to a religion, and these are not always its 
strongest members. I will discuss psychoana- 
lytic theory, not psychoanalytic practice. The 
lag between the two is sometimes great and 
it may be either theory or practice which
forges ahead of the other. It has been generally
true in psychoanalysis that the theory of
technique lags behind both basic theory and
practice.

¹ Revised from a paper delivered at the AAAS meetings
in Berkeley, California, December 1954.

IMPORTANT CHANGES IN PSYCHOANALYTIC THEORY


My plan here is to present a number of the 
basic assumptions of psychoanalytic theory 
and to indicate what I regard as the important 
changes in the theory pertaining to each of 
these assumptions in the last 20 or 25 years. 


1. The Psychoanalytic Theory of Motivation 

Psychoanalysis always has been and con- 
tinues to be a theory that centers on motiva- 
tion in human behavior. Uniquely characteris- 
tic of psychoanalysis is the kind of motivation 
it postulates: drives rooted in the biology of 
the organism. These drives are sexual—in 
the broader sense in which the word is em- 
ployed in psychoanalysis—and aggressive. 
They are characterized by their urgency, 
their intimate connection with various kinds 
of bodily behavior, both in terms of one’s 
own body and the bodies of other people, 
and by the rather bizarre quality of their mode 
of function when viewed in the light of or- 
dinary conscious motivation. 

Psychoanalysis has not given up this con- 
ception of primitive drives, but it has some- 
what changed its view of their place in per- 
sonality functioning and has added to its 
theory to account for other kinds of motivation 
as well. Whereas the psychoanalytic theory 
of motivation was formerly restricted almost 
entirely to primitive drive, now it includes a 
complex hierarchy of motivations that implies 
a progressive taming of drives with advancing 
development and progressive infusion of the 
drive representations with cognitive elements 
reflecting external reality (34). Its view of 
the dynamic relationships of the various levels 
of the motivational hierarchy has so changed 
as to increase the emphasis on derivative mo- 
tivations—to which it has always paid some 
attention—though by no means diminishing 
the emphasis on more primitive motives.

The very term “derivative motivation”
implies the earlier psychoanalytic concept of
the relationship of various motivational levels.
The primitive drives as such were considered
to be active even in derivative motivations,
which were regarded as reducible to drives
at the primitive level. The surgeon was still
considered to be expressing his primitive
sadistic impulse, the bibliophile to be expressing
his anal wishes. The concept which marks a
drastic shift from this point of view is that of
secondary autonomy (13). Its implications
are similar to those of Allport’s “functional
autonomy” (1) and it conveys the conclusion
that derivative motivations develop a semiautonomy,
and can be triggered relatively
independently into action. There remains
much controversy and unclarity, however,
as to the roles played in normal behavior by
drives pertaining to the different levels of
the motivational hierarchy.

Parallel to the recognition of a motivational 
hierarchy with degrees of autonomy, there 
have been developed new concepts of the kind 
of energy pertaining to motivations. The 
energy of the primitive drives is libidinal or 
aggressive but as motivations become pro- 
gressively autonomous their energy becomes 
progressively more “neutralized.” Such neutralization
is said to occur by “delibidinization”
and “deaggressivization” (15).

The overthrow of primitive reductionism 
goes even further, however, than merely to 
assert that derivative motivations develop a 
relative autonomy. While formerly all be- 
havior was reduced to primitive drive motiva- 
tion, now behavior is considered to be deter- 
mined not only by motivational factors but 
by other (structural) factors too, which enter 
into the causal determination of behavior as 
independent variables and hence are not re- 
ducible to motivations, whether primitive or 
derived.

2. The Independent Variables Other Than Drive


Despite the fact that from the very beginnings
of psychoanalysis concepts other than
motivational were used both in theoretical
and clinical studies, the prevailing intent of
the theory was to reduce every behavior to
motivational terms. A sharp change has come
about in that now independent variables
other than motivation are also assumed in
the determination of behavior. These independent
variables are of two main sorts.

The first are the factors which are concep- 
tualized as primarily autonomous, that is to 
say, autonomous from drive (13). Whereas 
secondary autonomy refers to derivatives of 
drives, primarily autonomous functions arise 
independently of drives.² These latter are also
conceptualized as the apparatuses of the ego 
and include among others perception, thought, 
memory, concept formation, and discharge 
thresholds (34). This is not to say that in any 
actual perceiving, thinking, remembering, or 
concept formation, drive factors play no role 
but rather that independent factors not 
derivable from drives do play a role in each 
of these events. 

It will be noted that this class of independent 
variables comprises intraorganismic capacities 
and structures and that some of them would 
be called cognitive in general psychological 
theory. That psychoanalytic theory regards 
these variables as truly independent is attested 
to by the relationships conceived between 
the two kinds of variables—drive and ego 
apparatus. They are considered to reciprocally 
influence each other (17) and efforts are made 
to study both how the development of drive 
is influenced by the ego apparatuses and how 
the development of the apparatuses is in- 
fluenced by drive. It should be noted that the 
ego apparatuses are only a part of the ego, 
and the important question has been raised 
in psychoanalytic theory whether other aspects 
of ego functioning may not also be primarily 
autonomous in their origin—as for example, a 
predilection for the use of a particular variety 
of defense mechanism (15). 

The second class of factors entering into 
the determination of behavior as independent 
variables are extraorganismic—those derived 
from the external environment. These may be 
classified in various ways—the most usual 
being a tripartite one into the physical en- 
vironment, the interpersonal environment, 
and the social environment. 

² Whether primarily autonomous functions have
available primary neutral energy (that is, neutral
energy not derived from libidinal or aggressive energy)
remains a moot question (16).

Naturally both psychoanalytic theory and
clinical studies have always dealt with environmental
factors, especially the interpersonal
ones. The entire field of the development
of object relationships—that is to say, rela- 
tionships with other people—comes under this 
heading. It is true, however, that the explicit 
theoretical accounting for the role of the or- 
ganized social environment is relatively recent, 
and it is true even that only relatively re- 
cently has the effort begun to encompass 
object relationships systematically within the 
psychoanalytic theory (27). I realize that this 
account may be challenged. I am aware that 
object relationships have long been extensively 
dealt with in psychoanalytic writings, but I 
believe that only recently have efforts been 
made to find a place in systematic theory for 
the role played by other people in the de- 
velopment of object relationships. In fact I 
think this lag in psychoanalytic theory proper 
is one of the reasons that Sullivan’s theory (41) 
has been so favorably received. It made an 
important contribution in its emphasis on
interpersonal relationships but this emphasis 
became so all consuming as to jettison the 
psychoanalytic theory of motivation, especially 
of primitive drive. The same may be said for 
Horney (22) and, with the addition of an
emphasis on social factors, for Fromm (12).

The problem of the relative roles played by 
and the relationships between intrapsychic, 
interpersonal, and social factors is, of course, 
subject to much controversy. It remains true 
by and large that the heaviest stress in psychoanalytic
theory (insofar as it is to be distinguished
from a general psychological
theory) will always be on the intrapsychic 
factors. The outstanding demonstration that 
there nevertheless is room for social factors 
within the framework of the established psy- 
choanalytic theory proper is Erikson’s work 
(3, 4, 5). 


3. Maturation in Psychoanalytic Theory 


I will turn now to the question of whether or not
these changes have resulted in any alteration 
in another one of the basic pillars of psycho- 
analytic theory, its heavy emphasis on mat- 
urational factors in personality development. 
Psychoanalysis continues to emphasize inborn 
maturational sequences, but in addition to 
such sequences in the development of drives, 
they are now postulated for the development 
of the ego as well (15). Whatever disputes 
there may be about the accuracy of the familiar
oral-anal-phallic-genital sequence, or the
unsettled state of the sequence of stages in
ego maturation, psychoanalytic theory continues
to put much weight on such maturational
sequences. It is true, however, that with
the “new regard for the environment” (27), 
learning has come to occupy a more central 
role in the psychoanalytic theory of develop- 
ment. There is little agreement in psycho- 
analytic, just as in general psychological, 
theory as to the relative roles to be ascribed 
to learning as against maturation. A tem- 
porary re-emphasis of maturation as against 
learning has taken place in some areas of 
psychoanalytic theory as a reaction against 
the postulation of highly complex very early 
object relationships—as early as the first 
months of life—by the school around Klein 
(26). Psychoanalysis has no well-defined and 
explicit learning theory (21). Yet it seems clear 
that psychoanalytic theory is incompatible
with S-R learning theory and the discussion 
of its possible relationship to a cognitive or a 
motivational learning theory like Tolman’s 
(42) is beyond the scope of this paper. It 
may nevertheless be mentioned, as pointed 
out by Rapaport (35), that Hartmann’s concept
of automatization (13) is the psychoanalytic
equivalent of habit. Behavior according to
this concept is originally learned in a complex
motivational context, but may become struc- 
turalized, semiautonomous, triggered by exter- 
nal stimuli, and automatized or, in Lewinian 
(30) terminology, “ossified.” The explanation
of the development of the motivational
life and of interpersonal relationships
by maturational rather than learning processes 
continues to be much more central to psychoanalysis
than to general psychology.

4. Psychoanalysis as a Genetic Theory

From a discussion of maturation it is natural 
to turn to another pillar of psychoanaly- 
tic theory, its emphasis on longitudinal fac- 
tors—the genetic point of view. Psychoanaly- 
tic theory continues to be heavily genetic in 
its emphasis. While it is, of course, the case 
that genetic factors are determinative of cur- 
rent behavior only insofar as they have 
shaped the form of currently active elements of 
psychic functioning, it is nevertheless true 
that psychoanalytic theory holds that many 
behavioral manifestations become understandable
only in terms of their genesis. I should
mention here two recent major assumptions
in psychoanalytic theory that pertain to what
we might call general laws of genesis. The first
is that while psychoanalytic theory has long
held that the phases of ego development occur
in close connection with the correlated phases
of libidinal development, it is now proposed
that the vicissitudes of aggressive drives and
of independent factors in the ego must also
be taken into account. An example is Hartmann’s
proposal that “if the defensive reaction
against danger from within is modelled
after the one to danger from without, it is
possible that there too [in the defense mechanisms]
the use of—in this case more or less
neutralized—aggressive energy is more regular
than the use of desexualized libido” (15).

The second major suggestion is that primitive
id functions may be “taken over” by the ego
and form the basis for quite different functions
or different uses of the same function. The
defensive functions of the ego again offer the
most ready examples. As Hartmann (15)
says, “introjection... probably exists as a
form of instinct gratification before it is used
in the service of defense. ... The ego can use,
for defense, characteristics of the primary
process, as in displacement. ... Freud [10]
has drawn a parallel between the mechanism
of isolation and the normal process of attention.”
A proper understanding of such an
altered function would necessarily involve a
knowledge of its genesis. The importance of
genetic concepts becomes especially obvious
in the partial, though to be sure highly
modified, return to earlier forms of function
in regressive states. In regression there is to a
greater or lesser extent a loss of relative
autonomy and a reversion of neutralized
energy to more aggressive and libidinal forms.


5. Primary and Secondary Processes in Psychoanalysis


An exceedingly important foundation of
psychoanalytic theory, regarded by some as
Freud’s greatest achievement and one as
yet little regarded by general psychology, is
the description of two modes of functioning
of the psychic apparatus: the primary and
secondary processes (8). The primary process
abides by the pleasure principle, and
employs the mechanisms of displacement,
condensation, and symbolization. It is a
kind of short circuiting of gratification, dis- 
regarding the laws of logic and operating with 
little reference to the nature of external real- 
ity. The secondary process is a mode of psy- 
chic functioning that abides by the reality 
principle, by the laws of logic, and takes into 
account the nature of real external reality. 
One of the important developments in psycho- 
analytic theory is that these two modes of 
psychic functioning are no longer regarded 
as a dichotomy and as mutually exclusive. 
They are now conceived as ideal poles of a 
continuum (28), and an important new con- 
cept, proposed by Kris (29), gives account of 
the fact of the adaptive use of primary process
functioning by the ego, namely: “regression
in the service of the ego.”

It may be worth pointing out here that the 
shift in the psychoanalytic view of motivation 
described earlier can from another point of 
view be described as a shift from emphasis on 
primary process functioning to secondary 
process functioning. The mode of operation of 
primitive drives in the immature organism is 
largely by primary process mechanisms. In 
fact it was an unwarranted generalization of 
this revolutionary discovery that initially 
led psychoanalysis to subordinate all else 
to primitive motivational factors. The recog- 
nition of derivative motivations and of inde- 
pendent variables other than drives operating 
in the determination of behavior—more spe- 
cifically independent variables relating to the 
nature of the external world—is a shift to the 
view that much of normal functioning is 
secondary process functioning. 


6. The Unconscious in Psychoanalysis 


Psychoanalysis continues to insist on the
overwhelming importance of unconscious
processes in psychological life. It must be
pointed out, however, that the significance of
unconscious factors is closely related to the
psychoanalytic theory of motivation, and 
alteration in the motivational view will have 
serious repercussions on the theory of the 
role of the unconscious. If a good deal of nor- 
mal behavior is ascribed to derivative semi- 
autonomous motivations which can be con- 
scious, by just that much is unconscious mo- 
tivation shorn of its significance in normal 
behavior. Yet there has not been any real
diminution in psychoanalysis of emphasis on
unconscious motivation. In my opinion, this 
seeming paradox may be at least partially 
resolved by the view that the semiautonomous 
motivations are indeed only semiautonomous 
and that any particular behavior may be 
viewed in a “nest” of motivational contexts of
increasing generality. It is difficult to describe 
this conception in just a few words, but it 
argues that a semiautonomous conscious 
motive operates in a wider context as a means 
in relation to a less conscious and more 
primitive motivation, which in turn operates 
in a wider context as a means in relation to an 
even more primitive and unconscious motiva- 
tion. I would like to repeat that the role that a 
theoretical system is likely to ascribe to un- 
conscious processes will be intimately linked 
with its view of the motivational structure of 
personality. 


7. The Introduction of the Structural Point of View

I can summarize a number of the important 
changes I have sketched in psychoanalytic 
theory by uniting them under the heading: 
the introduction of the structural point of 
view. It is often said that the structural point 
of view was introduced into psychoanalysis 
in 1923 with the publication of Freud’s work 
The Ego and the Id (9). In some ways this 
statement is true but in others false. It over- 
states the case in that structural considerations 
are both implicit and explicit in psychoanalytic
theory from the very beginning. The concept
topographic rather than structural was
first used and the topographic divisions were
made in the dimension of consciousness:
the unconscious, the preconscious, and the
conscious. Under the impact of the realization
that both that which was repressed and
the repressing forces too could be unconscious, 
Freud (9) was led to hypothesize the structures 
id and ego, and later the topographic divisions 
of consciousness, hitherto called “systems,”
became regarded as only qualities rather than 
coherent systems (11). As long ago as in the 
seventh chapter of The Interpretation of 
Dreams, consciousness itself was regarded as
a supraordinate apparatus. In present theory,
this apparatus is ascribed to the ego, yet this
early concept of consciousness remains the
most sophisticated psychoanalytic view of
its functioning (38). The statement is further
false in the sense that the id-ego-superego 
trichotomy is only a very gross statement of 
the structural point of view and actually gave 
names to structures which had, though not 
systematically, in part been already marked 
out. Only in the last several decades has the 
structural point of view been developed in a 
thorough and systematic manner. 

It is not easy to give a clear yet brief picture 
of this point of view. A full description of a 
mental process—a metapsychological descrip- 
tion—requires its discussion from three points 
of view, dynamic, economic, and structural. 
Of these points of view, the dynamic refers to 
the interplay of forces, the economic refers to 
energy considerations and to the restoration 
of the homeostatic equilibrium, while the 
structural refers to that aspect of the mental 
process which characterizes its place in the 
enduring, stable organization of the mind 
(39). In roughest form, a process is charac- 
terized structurally as to its position in one of 
the three mental structures—id, ego, or 
superego. Far more subtle dissection of the 
structure of mind has been carried out as 
far as ego functioning is concerned. Structural 
considerations within one of the three major 
structures are called intrasystemic (15). The 
description of the ego, for example, as an organization
of semiautonomous behavioral
dispositions is a structural statement. Early 
psychoanalytic theory may be characterized 
as dynamic and economic. If the early psycho- 
analytic view of motivation had been taken 
seriously, the most ordinary item of behavior 
could be conceived as occurring only as a 
result of the interplay of powerful forces,
what Rapaport called “a battle of Titans”
(34). The introduction of the structural point
of view makes possible a view of personality
functioning which includes its steady, stable,
ordinary, organized, enduring patterns of be- 
havior and thinking. 

These advances in the structural conception 
are another way of stating the change in the 
psychoanalytic view of motivation, of energy 
pertaining to motivation, and even of the rec- 
ognition of independent variables other than 
motivation determining behavior, since we 
may restate these propositions as: derivative 
motivations arise from relatively autonomous 
(structuralized) apparatuses employing neutralized
energy; the primary autonomous
apparatuses are ego structures.³ The ego as a
whole is itself conceptualized as relatively 
autonomous, from the id on the one hand and 
the environment on the other (37).

ALTERNATIVE FORMULATIONS OF TRENDS
IN PSYCHOANALYTIC THEORY


I have now stated in brief the major as- 
sumptions of and major changes in psycho- 
analytic theory. To achieve further clarifica- 
tion I will now describe various other ways in 
which these same changes have been stated. 

One general way of stating the change is 
that psychoanalysis has moved into the area 
of ego psychology in comparison with its for- 
mer preoccupation with id psychology. Ego 
psychology includes considerations of the 
autonomous apparatuses, the hierarchy of 
motivations which are the systems of dis- 
position to behavior as we find them in the 
ego, and the extra-organismic variables which 
enter into the determination of behavior by 
way of the cognitive functions of the ego. 

Another way of stating the change is to say
that psychoanalysis is now not only a psychology
of the depths, but of the surface too (6).
The psychological surface is the ego, and it is
the ego which is in contact with the external
real world.

Yet another way of saying the same thing is 
that psychoanalysis has included adaptation 
among its basic concepts (13), whereas for- 
merly it was preoccupied with intrapsychic 
processes. Adaptation implies an ego in contact 
with the real external world and the determina- 
tion of behavior by variables other than moti- 
vational. The same principle has been ex- 
pressed as the increased attention paid by 
psychoanalysis to the environment (27). I 
have made this point in another way in my 
discussion of the primary and secondary 
processes. The secondary process abides by 
the reality principle, which is the principle 
of adaptation. The most general way to state 
the change is that psychoanalysis is becoming a
complete psychological system, embracing all 
of human psychological functioning (13). 
While psychoanalysis was principally concerned
with psychopathology, large areas of
personality functioning remained unaccounted
for by its theory. As a total psychological
system, psychoanalysis had to complement
its motivational considerations with cognitive
and adaptive ones. The social structure, for
example, which at least in some ways plays
a more obvious though perhaps no more
prominent role in “normal” than in “neurotic”
function, begins to loom larger in present-day
psychoanalytic theory.

³ Whether derivative motivations “arise” in a hierarchy
from hierarchically ordered means-structures or
whether the motivations themselves must be considered
structures requires further clarification.


CHANGING RELATIONSHIPS WITH PSYCHOLOGY 


Now a few comments on the changing rela- 
tionships between psychoanalysis and aca- 
demic psychology. I have said that psycho- 
analysis aspires towards becoming a total 
psychological system by including cognitive 
and adaptive considerations as well as motiva- 
tional. It will be well to stress that in part the 
stimulus for this move comes from academic 
psychology, though references in psychoana- 
lytic theory to the contributions of academic 
psychologists are few and far between. It is 
obvious, though, that cognitive and adaptive 
considerations have been the special provinces 
of academic psychology, and just as psycho- 
analysis has moved to include cognitive func- 
tions and to recognize more explicitly adaptive 
considerations so has psychology moved to a 
greater emphasis on motivation. The so-called
“new look” (2) in research on perception (that is,
cognition) is the study of perception in a motivational
framework. There is a good deal of thought in
general psychology about the relationship of 
the cognitive and motivational factors. For 
instance, Klein and Krech (25) consider the 
dichotomy of cognition and motivation mis- 
leading and unnecessary and regard all be- 
havior as consisting of unities which involve 
cognition and motivation indivisibly. I agree 
with this point of view and have separated 
motivation and cognition in this paper only 
for convenience of exposition. 

It must not be presumed, however, that 
psychoanalysis and psychology are now merg- 
ing or becoming indistinguishable. Indeed, 
many people will find it ridiculous that I 
sound this warning. Nevertheless, it is worth 
pointing out that once propositions from these 
perspectives are stated on a sufficiently high 
level of abstraction, vital differences begin
to be washed away. This is most strikingly 
true with regard to the problem of motivation. 
On the one hand, both psychoanalysis and 
psychology stress motivation and both accept 
the ideas of primary, derived, and relatively 
autonomous motivations; on the other, the 
specific picture of the motivational structure 
of man is vastly different in the two. 

Psychoanalysis still adheres to a view of 
motivation as built on drives rooted in the 
biology of the organism. However much de- 
rived motivations are recognized, psycho- 
analytic theory still views behavior as es- 
sentially motivated by and occurring in the 
context of bodily drives. Drives for security, 
success, prestige, status, seem, relative to 
these basic drives, “superficial” to the psy- 
choanalyst. He thinks rather of castration 
fear, oedipal wishes, cannibalistic impulses, 
homosexuality, or a drive to rend and destroy. 
It is often said that the analyst’s preoccupa- 
tion with these primitive impulses stems from 
his almost exclusive concern with disturbed 
personalities. But the analyst believes that 
the normal person, too, is occupied with dealing 
with such impulses. The difference between 
the normal and the neurotic he sees not in the 
root of the tree of the motivational hierarchy 
but much closer to its crown. 

A word about the libido theory which is 
tenaciously defended by psychoanalysts be- 
cause it represents for them the view of motiva- 
tion described in this paper as classically and 
specifically psychoanalytic. In some sense this 
is unfortunate, because as a consequence 
criticism of the libido theory and demonstra- 
tion of errors in it are viewed by some groups 
of analysts as well as by psychologists as an 
overthrow of the entire psychoanalytic theory 
of motivation. I would like to suggest that 
what is really being defended by analysts is 
not the libido theory as such but the over- 
whelming importance of relatively primitive 
bodily drives in motivating behavior. Discus- 
sions of the libido theory (40) in my opinion 
would be more fruitful if it were not regarded 
as equivalent to the more general psycho- 
analytic theory of motivation. The place of 
instinctual libidinal impulses in personality 
functioning does not stand or fall with the 
libido theory per se. 

Psychologists are also beginning to talk 
more frequently of unconscious processes in 
human beings. But here, too, the apparent
rapprochement should be taken with a grain
of salt. It is easy to employ the concept of 
unconscious processes without taking them 
seriously, while they are taken very seriously 
indeed in psychoanalytic theory and practice. 
Behavior, as described earlier in this paper, 
is regarded by psychoanalysis as essentially 
motivated by unconscious processes. To some 
extent the difference between the psychoana- 
lyst and the psychologist is the molar-molecu- 
lar problem. The psychologist is ordinarily 
concerned with small segments of behavior 
such as can be observed and controlled in the 
laboratory, and which do not readily show 
the presence of unconscious forces in the psy- 
choanalyst’s sense. The psychoanalyst on the 
other hand is more concerned with “real life”
behavior and major trends and crises. Some 
integration of the two points of view may be 
brought about by what I described earlier as 
the possibility of viewing any particular behavior
as occurring in a “nest” of motivational
contexts of increasing generality. 

There is much work in contemporary clin- 
ical psychology which deals with fairly direct 
derivatives of unconscious processes; for ex- 
ample, in some types of Rorschach responses. 
But these must not be confused with the un- 
conscious processes themselves; and just as 
important is the fact that although the psy- 
chologist may recognize these as derivatives 
of unconscious processes, this should not be 
taken to mean that the person who has pro- 
duced these responses is thereby brought 
any closer to an awareness of, or insight into, 
the contents of his unconscious. 


SUMMARY 


To summarize briefly: I have described 
some of the basic assumptions of psycho- 
analytic theory and the recent changes which 
have taken place in them under the headings: 
1. The motivational theory; 2. the introduc- 
tion of independent variables other than 
drives; 3. emphasis on maturation; 4. the 
genetic emphasis; 5. the primary and second- 
ary processes; 6. the unconscious; and 7. the 
introduction of the structural point of view. I 
then indicated that these changes in psycho- 
analytic theory are often described as psy- 
choanalysis becoming concerned with: (a) ego 
psychology; (b) the psychology of the surface;
(c) increased attention to the environment; 
and (d) the introduction of cognitive and
adaptive considerations. I closed with a few
general comments on the relationship of psychoanalysis
and psychology.


THE INTERPERSONAL BEHAVIOR OF CHILDREN IN
RESIDENTIAL TREATMENT¹


HAROLD L. RAUSH, ALLEN T. DITTMANN, AND THADDEUS J. TAYLOR
National Institute of Mental Health 


THIS REPORT is concerned with the interpersonal
behavior of a small group of
disturbed children and with changes in
their behavior over a period of a year and a half 
in a residential treatment program. In general, 
the study was exploratory, oriented toward 
description and a search for order in complex 
behavioral events, toward the evaluation of 
behavioral change in treatment, and toward 
the evaluation of a method for observing and 
coding interpersonal behavior. 


Toward Description 

The source of data was spontaneous be- 
havior as it appeared in the daily activities of 
six “hyperaggressive” or “acting-out” boys. 
To some extent, then, the paper presents a 
naturalistic study of the social behavior of a 
small group of children. Studies of small
groups have mostly centered on the social
interactions of adults in task-oriented situations
(Bales, 1950; Hare, Borgatta, & Bales,
1955), whereas interest in the social behavior of
children has generally focused on specific
variables in relation to more or less specific 
hypotheses (Baldwin, 1955). The work of 
Barker and Wright (1954) on the ecology of 
children’s behavior in a small Midwestern 
town is an exception to the trend. The methods 
of the present study differ from those of 
Barker and Wright—a difference in part 
guided by doubts about the usefulness of a 
level of approach that is so strictly phe- 
nomenological. But there is a major shared 
value orientation: that it would be useful to 
have objective and manageable descriptions 
of the quality, frequency, and intensity of 
everyday behavior for all sorts of groups and 
for all sorts of situations. Such a statement is
not an exhortation to random empiricism.
Description is not necessarily atheoretical, and
an interest in descriptive patterns of everyday
behavior is not necessarily antithetical to the
testing of hypotheses about specific variables.
To some extent, there is, however, a question of
order. Anthropologists, for example, have
learned that one cannot legitimately interpret
specific actions in isolation from other actions
or in isolation from patterns within a culture,
and they have learned how misleading isolated
hypotheses may prove when contextual information
is lacking. Adequate psychological
descriptions of the everyday behavior of
various subgroups, theoretically or pragmatically
defined, would constitute a step toward
the formulation of specific hypotheses. Such
descriptions would also constitute a step
toward a general ethology of human behavior.

1 Among the people, other than the authors, who
have been involved in this study at one time or another,
special note should be made of the contributions of
Donald S. Boomer and D. Wells Goodrich. They were
not only contributors to the observation and coding
processes, but the procedures and methodology used
here owe much to them. Fritz Redl, Chief of the Child
Research Branch, and the Child Care Staff made the
study possible by their cooperation.

The group under study was small, and it 
was a rather special one in the sense that it was 
selected for a particular syndrome to which 
considerable social interest accrues in relation 
to delinquency and which is recognized as 
difficult to treat clinically. As a group, it was 
not task-oriented, nor had it come together 
through the common interests of its members. 
The environment was also a special one—even 
if only because psychiatric institutions are not 
typical of children’s living arrangements. Be- 
cause of the selection process, the children were 
rather homogeneous, and they lived together 
under the protection, supervision, and control 
of a professional staff in the rather homogene- 
ous environment of the institution. The 
limitations of the present study are, however, 
perhaps not so much in the special char- 
acteristics of the subjects or of their environ- 
ment, but rather in the lack of comparative 
data. 


Toward Evaluation of Change 


The second aim is toward the evaluation of 
treatment change, not as measured by ratings 
or by tests, but rather as manifested in the 
interactions of everyday life. It was, of course, 
expected that the children would show some
improvement in their overt interpersonal rela- 
tions. Findings of improvement would be 
consistent with clinical impressions about most 
of the children. But long-term clinical work 
has its hazards, one of which is that changes, if 
and when they occur, are often so gradual that 
an awareness of the similarities and contrasts 
between the then and the now is easily lost or 
distorted. A more formal method would 
exercise a discipline on both clinical wishes 
and clinical doubts, while perhaps also pointing 
to areas and problems which were less clinically 
obvious. 

Clearly, the study is not definitive in relation 
to treatment of the “acting-out” child or in 
relation to any of the specific variables that 
enter into the residential treatment situation. 
Critical tests of treatment benefits can most 
relevantly be made outside the framework of a 
therapeutically focused environment; critical 
tests would involve variables in addition to 
those overt aspects of interpersonal behavior 
which are investigated here; and finally, 
critical tests would involve adequate controls. 
What can be fruitfully investigated is whether 
a group of children change systematically in 
their ordinary interactions. Given evidence of 
such changes, examination may be made of 
their relevance to treatment aims. From the 
viewpoint of therapeutic concerns, the pos- 
sibilities of favorable modification in the 
behavior of a clinically difficult subgroup might 
be shown, and perhaps the clinical worker may 
be encouraged in his efforts. 


Toward Evaluation of the Method 


At the least, a method for evaluating inter- 
personal behavior and behavioral change was 
studied. The method involved multiple ob- 
servations of children in naturalistic settings 
and the coding of these observations by a 
scheme which was originally devised for study- 
ing the group behavior of adults (Freedman, 
Leary, Ossorio, & Coffey, 1951; Leary, 1957). 
A partial proof of the method lies in its success 
or failure in demonstrating expected phe- 
nomena. That is, one aspect of the study is 
related to a question of construct validity 
(Cronbach & Meehl, 1955). To the extent that 
the approach achieves the differentiations that 
might reasonably be expected of it, the method
offers promise for investigations of other
groups in other environments.


METHOD

The Children and the Institution


The six boys were the total patient population of a 
hospital ward which they entered when they were from 
8 to about 10 years old. There was—and still is—no 
diagnostic category into which they fit easily. In gen- 
eral, their pathology and their actions were beyond 
the realm of the typical childhood neuroses, yet at the 
same time they did not represent childhood psychoses. 
The behavior of the children was characterized by such 
overwhelming aggressiveness that they could not be 
tolerated by community, schools, foster parents, or 
parents. Four boys had been referred to courts for 
destructive behavior, and three of these had been sent 
to a reformatory shortly before their admission. The 
two children who had not come to the attention of the 
courts had been excluded from several schools because 
of their antisocial actions. There was usually a history 
of multiple contacts with social agencies, but outpatient 
clinical treatment and special school programs seemed 
to be of little use, at least in the long run, in these cases. 
None of the boys were “gang” delinquents. Their 
problems seemed rather a function of intense personal- 
ity disturbances with a marked deficiency in ego con- 
trols, particularly where aggression was concerned. 
They were children such as Redl and Wineman (1957)
have described, children often called hyperaggressive
or “acting-out,” although neither of these terms is
ideal. All six were physically healthy, and so far as
could be judged from psychological and psychiatric 
examinations, they were of normal intelligence and 
showed no evidence of gross brain damage. They 
came from socioeconomic milieux ranging from lower- 
lower through lower-middle class, with two children 
from lower-middle class homes and the rest lower in 
socioeconomic status. 

Throughout the time of these investigations, the 
children lived on the ward. Their program was planned 
intensively and minutely. All were seen approximately 
four hours weekly in psychotherapy; their schooling, 
which took place adjacent to the ward, was with spe- 
cially trained and experienced teachers and clinicians; 
ward programing and the handling of clinical problems 
of daily living were closely planned and organized, 
again by people with considerable experience with 
disturbed children. Most of all, considerable time and 
effort were devoted to the coordination and integration 
of the various levels of treatment. This brief descrip- 
tion, though markedly oversimplified, is relevant for 
interpreting the results of the study. 

The present report examines the interpersonal be- 
havior of the children at two phases. When the initial 
series of observations was made, two of the children 
had been at the institution between three and four 
months, and the other four had been there nine or ten 
months. One may suppose, then, that they knew each 
other fairly well and that they were familiar with the 
general pattern of living within the ward. Their ages, 
at the start of the study, ranged from 8 years, 11
months, to 10 years, 11 months, with the median age
at 10 years. The second series of observations was
made some 18 months later.²


The Observations 


In each phase of the study, each of the six children 
was observed twice in each of six settings. There were, 
thus, a total of 144 observations, 72 in each phase. A 
number of observers were involved, but each single 
observation was made by a single observer.³ The observer
would go over to the ward or to the gym or to
the outdoor play area, for example, to observe a par- 
ticular child according to his assignment of children 
and settings. He would concentrate on that particular 
child, trying to follow as much as was possible the 
transactions between this child and other children or 
adults. But he would also try to note what went on 
among the other children within the locus of observa- 
tion. The observer did not take notes; he would gauge 
the length of his observation to the amount of the 
specific activity that he could remember. This meant 
that the time period of observation was highly variable. 
If there was, for example, an extremely rapid interplay 
of behavior, the observation time might be as brief 
as several minutes; on the other hand, it might extend, 
if the interchanges were few and far between, to as 
much as 20 minutes or a half hour. The mean time for 
an observation was about 8 minutes. An analysis of 
variance indicated that differences in observation time 
allotted to either individual children or to different 
settings were not significant. 
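
The sampling plan described above can be laid out explicitly. The following is only an illustrative sketch, not the authors' procedure: the child identifiers and the short setting labels are hypothetical names chosen for the example (the labels paraphrase the six settings the report lists), and nothing here models the assignment of observers.

```python
from itertools import product

# Hypothetical labels; the report identifies neither the children nor the settings by these names.
CHILDREN = [f"child_{i}" for i in range(1, 7)]
SETTINGS = [
    "breakfast",
    "bedtime snacks",
    "other mealtimes",
    "structured games",
    "unstructured group activities",
    "arts and crafts",
]
PHASES = ["early", "late"]
REPEATS = [1, 2]  # each child was observed twice in each setting


def build_schedule():
    """Return one record per planned observation (phase x child x setting x repeat)."""
    return [
        {"phase": phase, "child": child, "setting": setting, "repeat": repeat}
        for phase, child, setting, repeat in product(PHASES, CHILDREN, SETTINGS, REPEATS)
    ]


schedule = build_schedule()
per_phase = sum(1 for obs in schedule if obs["phase"] == "early")
per_child = sum(1 for obs in schedule if obs["child"] == "child_1")
print(len(schedule), per_phase, per_child)
```

Counting the records reproduces the totals given in the text: 144 observations in all, 72 in each phase, and 24 protocols for each child.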

After leaving the setting, the observer would immediately
dictate onto tapes or Audograph discs a factual,
descriptive report of his observation. Observers were
cautioned to focus on interaction, to be as specific and
concrete as possible, and to avoid psychological terms
and inferential conclusions. In general, observer training
required only several practice trials.⁴ The protocols
of the observations resemble those obtained by Barker 
and Wright (1954), though their descriptions were 
undoubtedly more fully detailed. 


The Settings 


Different situations—for example, even different 
games—exert a pull for different behaviors, and they 
differ in the kinds of behavior they sanction, positively 
encourage, or inhibit (Redl & Wineman, 1957). The
opportunity rarely exists to study behavior in a repre- 
sentative variety of settings, to relate behavior to what 
Brunswik (1947) has called “the ecology of environ- 
mental events.” A residential treatment center provides 
such an opportunity. The design of the study utilized 
a somewhat representative sampling of the kinds of
activities around which the children lived their lives
so as to allow investigation of the relevance of different
settings for interactive behaviors. This aspect of the
study will not be discussed here, but the settings should
be noted: (a) breakfast—an early morning observation;
(b) snacks in the period just before the children went to
bed at night; (c) other mealtimes; (d) structured game
activities, ranging from cards to basketball;⁵ (e)
unstructured group activities where the specific external
task structure was minimal—for example, social
conversations; and (f) an arts-and-crafts period. Since
one phase of the study occurred during the summer,
when school was not in session, this last setting was
utilized as an approximation to an instructional situation.
Since selection was in terms of kind of activity,
the data are not representative of time as a time sample
would be; the nature of the settings did, however,
insure that various times of the day were included.
Relevant settings omitted from the study were psychotherapy,
other two-person group situations, where the
presence of an observer might be intrusive, and school
situations other than arts and crafts.

2 The first series was completed within a month;
the second series took some five months to complete.

3 In addition to the authors and others already mentioned,
Joseph H. Handlon and Jeston Hamer served
as observers.

4 The problems of making formal observations in
an on-going clinical operation and the methodological
issues in pretesting the approach are complex matters.
Discussions of these questions by A. T. Dittmann and
by D. W. Goodrich are in preparation.


The Coding Scheme 


The search for methods for describing human interac- 
tions and the problems involved in extant approaches 
have been reviewed by Heyns and Lippitt (1954). 
The present need was for a scheme (a) applicable to a 
wide variety of situations, (b) relevant for the study of 
personality and individual behavior patterns, (c) suit- 
able for dealing with the behavior of both children and 
adults, and (d) relatively comprehensive. The approach 
initially described by Freedman et al. (1951), and dis- 
cussed in detail by Leary (1957), was used. 

The scheme (Fig. 1) is based on two polar coordi- 
nates. One is along the dimension of affection: love 
(affiliate, act friendly) to hate (attack, act unfriendly). 
The other axis is concerned with status: dominate 
(command, high status action) to submit (obey, low 
status action). Each action of one person toward 
another is coded by letter into one of 16 categories 
along the periphery of the circle in accordance with 
its blending of the two coordinates. The words below 
the letters are simply examples of the kinds of actions 
that might be coded at that position. In practice, 
coding was generally a compromise between the words 
representing the categories and the position relative 
to the two axes, but in cases of doubt the position was 
utilized rather than the words.⁶
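
The rule that position on the two axes takes precedence over the example words can be illustrated with a small sketch. It rests on assumptions that the text does not state outright: that the 16 letters A through P are spaced evenly around the circle, that A lies at the "dominate" pole, and that the letters then run through E (hate), I (submit), and M (love). The numeric scales for affection and status are likewise hypothetical; the coders worked by judgment, not by computation.

```python
import math

LETTERS = "ABCDEFGHIJKLMNOP"  # assumed ordering around the circle


def category_from_axes(affection: float, status: float) -> str:
    """Map a position on the two axes to the nearest of 16 categories.

    affection: +1 = love (act friendly), -1 = hate (act unfriendly)
    status:    +1 = dominate,            -1 = submit
    Both scales are hypothetical conveniences for this sketch.
    """
    if affection == 0 and status == 0:
        raise ValueError("the center of the circle has no direction")
    # Angle measured from the 'dominate' pole, increasing toward the 'hate' pole.
    angle = math.degrees(math.atan2(-affection, status)) % 360
    # Sixteen equal segments of 22.5 degrees each.
    return LETTERS[round(angle / 22.5) % 16]


print(category_from_axes(0, 1))       # dominate pole
print(category_from_axes(-0.4, 1.0))  # dominant, slightly hostile
```

Under these assumptions the four poles fall on A, E, I, and M, and a dominant, slightly hostile action falls on B.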

As in the Bales scheme (1950), the attitude taken by 
the coder is that of the “generalized other.” The ques- 
tion he attempts to answer via his categorization is, 
“What is this person doing to the other? What kind of 
relationship is he attempting to establish through this 
particular behavior?” (Heyns & Lippitt, 1954, p. 91). 
For example, when a child says, “Wasn’t that a good
movie we saw last night?” he is generally not coded
J (Fig. 1), although J can include asking someone’s
opinion; he is rather coded M, which is simple affiliation.
Or when a child says to another, “I can kick better
than you,” rather than simply stating an opinion P,
he is usually establishing a dominant, slightly hostile
relationship B, although within some contexts such a
statement might involve more of an aggressive element
than one of status differentiation and so might be coded
D. The statement, “I don’t want to play with you,”
may represent active rejection C, whiny complaining F,
or very passive withdrawal H, depending on the context
and on the quality with which it was said.

5 Behaviors specific to the game itself—for example,
passing a ball, playing a card, or claiming one’s turn—
were not coded or considered in the analyses to follow.

6 Practical coding problems were sometimes resolved
by double codings, but such solutions are not
wholly adequate.

Fig. 1. [Figure not reproduced.] Modified from Freedman et al. (1951) and Leary (1957, p. 65).

In addition to the categorization described above, a
form of “intensity” coding was employed simultaneously.
Each interaction was also coded as to whether
the behavior was (a) uninvolved—for example, a very
casual “Hi” or a casual ignoring of another’s statement
or request, (b) involved and appropriate, or (c) involved
and inappropriate—either an overly intense behavior
or a behavior that is qualitatively inappropriate such
as responding to an affectionate gesture with an attack.
These latter codings are rather crude, and the aspects
of involvedness, appropriateness, and intensity are
confounded. Recognizing these limitations, we can
still make use of these categories over a wide sampling
of behavior, and in point of fact, they do add another
useful dimension to our findings.

Coding was done from tapes or typescripts of the 
dictated observations. Each observation was coded by 
at least two coders working together. The coders read 
the protocol (or listened to the tape) line by line, coding 
each interaction in terms of the person behaving 
(the specific child or adult), the interpersonal quality 
and “intensity” of the behavior, and the person toward 
whom the behavior was directed. Thus, one may ob- 
tain in chronological sequence the interactions of any 
given child toward any other child or adult, and also
the behavior of others, children or adults, towards
him.⁷
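
One way to picture the coded record is as a chronological list of entries, each naming the person behaving, the category, the intensity, and the person addressed. The sketch below is an assumption-laden illustration: the class name, field names, intensity labels, and sample entries are all hypothetical and are not taken from the study's materials.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Interaction:
    order: int      # position in the chronological sequence of the protocol
    actor: str      # the person behaving (a specific child or adult)
    category: str   # one of the 16 letters, A through P
    intensity: str  # "uninvolved", "appropriate", or "inappropriate"
    target: str     # the person toward whom the behavior was directed


def sent_by(records: List[Interaction], actor: str,
            target: Optional[str] = None) -> List[Interaction]:
    """Interactions of one person, optionally limited to one recipient, in order."""
    chosen = [r for r in records
              if r.actor == actor and (target is None or r.target == target)]
    return sorted(chosen, key=lambda r: r.order)


def received_by(records: List[Interaction], person: str) -> List[Interaction]:
    """Behavior of others directed toward one person, in order."""
    return sorted((r for r in records if r.target == person), key=lambda r: r.order)


# Invented sample protocol, for illustration only.
protocol = [
    Interaction(1, "child_1", "M", "appropriate", "child_2"),
    Interaction(2, "child_2", "B", "appropriate", "child_1"),
    Interaction(3, "child_1", "C", "inappropriate", "adult_1"),
]
print([r.category for r in sent_by(protocol, "child_1")])
print([r.actor for r in received_by(protocol, "child_1")])
```

Keeping the order field on every entry is what allows both views, behavior sent and behavior received, to be read off in sequence from the same record.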


7 Freedman et al. (1951, p. 155) present some data 
on interrater agreement in coding verbal behavior of 
adults. Adequate assessment of reliability is made 
difficult by the fact that there is no baseline for evaluat- 
ing correlational indices with data scored in this fashion. 
Rank order correlations between codings from proto- 
cols of independent observers of the same events and 
between pairs of coders of the same protocols are in- 
variably high, but they are likely to be somewhat spur- 
ious, since there is some doubt that correlations be- 
tween random protocols are of zero order. Dittmann 
(in press) discusses a number of aspects of the reliabil- 
ity of the system. Considering agreement between two 
pairs of coders working with the same material, Ditt- 
mann notes that item-by-item agreement, analogous 
to test item reliability, is far greater than chance ex- 
pectancy, but he also notes that there remain appre- 
ciable discrepancies. When, however, the single interac- 
tions are grouped to form a profile for an observation, a 
situation analogous to test reliability, discrepancies 
between independent pairs of coders are far smaller 
than could be expected by chance. Similarly, the proto- 
cols from independent observers of the same events 
yield smaller differences than could be expected by 
chance. A recent check by the present authors compared 
different observers who observed the same children in 
matched, rather than identical, settings—a situation 
analogous to alternate forms. Differences, which 
might have resulted from observer variations or from 
lack of equivalence in the matched settings or from 
both, were well within chance limits. Furthermore, a 
series of observations made approximately two months 
apart with the same children also failed to yield sig- 
nificant differences. 

Clearly, none of these results allows a statement 
about level of reliability comparable to the usual 
Pearson r. The significant item-by-item agreement, 
together with the failure to find evidence of bias in the 
sources tested, and together with the finding of consist- 
ent individual differences, discussed below, would, 
however, seem to warrant the conclusion reached by 
Dittmann (1958). The conclusion is that reliability 
is adequate for grouped data—such as the individual 
profiles considered in the present study—although it 
may not be adequate for the analysis of single sequences.

There remains a question of possible bias occurring 
between the two phases of the study, and the only 
answers at present are indirect ones. First, the scheme 
seems fairly objective; second, raters have been aware 
of the problem, and their continued sensitivity to the 
possibility of bias has probably served to reduce that 
possibility; third, the data yield negative as well as 
positive results, whereas a bias would be likely to 
operate more consistently. It is recognized that such 
answers are incomplete. A more definitive check, 
which has awaited the presence of uncontaminated 
raters for whom protocols can be adequately disguised, 
is currently in process.


RESULTS AND DISCUSSION 


In the discussion that follows, the basic 
source of material was, for each child, the 24 
protocols—12 in each phase—in which he 
served as the central focus for observation. 
For analyzing qualitative changes, the inter- 
actions of individual children were distributed 
into the four quadrants of the circle (Fig. 1); 
frequencies at the midpoints (M, I, E, and A) 
were divided evenly between the two adjacent 
quadrants with any remaining odd entry 
randomly assigned.⁸ Further data from individual
segments of the 16-category scheme are
noted occasionally, where this would seem to 
provide clarification. The approach to analyz- 
ing the reaction of others to individual children 
is commented on later. 
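
The reduction from 16 categories to four quadrants can be sketched as follows. The grouping of letters into quadrants is an assumption inferred from the examples in the text (B, C, and D as hostile-dominant, F through H as hostile-passive, and so on); the quadrant names and the function itself are hypothetical. Only the treatment of the midpoints follows the stated rule: split evenly between the two adjacent quadrants, with any odd entry assigned at random.

```python
import random
from collections import Counter

# Assumed grouping of the twelve non-midpoint letters.
QUADRANT_OF = {
    "B": "hostile-dominant", "C": "hostile-dominant", "D": "hostile-dominant",
    "F": "hostile-passive", "G": "hostile-passive", "H": "hostile-passive",
    "J": "friendly-passive", "K": "friendly-passive", "L": "friendly-passive",
    "N": "friendly-dominant", "O": "friendly-dominant", "P": "friendly-dominant",
}
# Each midpoint letter borders two quadrants.
MIDPOINT_NEIGHBORS = {
    "A": ("friendly-dominant", "hostile-dominant"),
    "E": ("hostile-dominant", "hostile-passive"),
    "I": ("hostile-passive", "friendly-passive"),
    "M": ("friendly-passive", "friendly-dominant"),
}


def quadrant_totals(letter_counts, rng=None):
    """Distribute per-letter frequencies into the four quadrants."""
    rng = rng or random.Random()
    totals = Counter({q: 0 for q in set(QUADRANT_OF.values())})
    for letter, count in letter_counts.items():
        if letter in QUADRANT_OF:
            totals[QUADRANT_OF[letter]] += count
            continue
        first, second = MIDPOINT_NEIGHBORS[letter]
        half = count // 2
        totals[first] += half
        totals[second] += half
        if count % 2:  # odd remainder goes to one neighbor at random
            totals[rng.choice((first, second))] += 1
    return dict(totals)


# Invented frequencies, for illustration only.
print(quadrant_totals({"A": 4, "B": 3, "E": 2, "M": 6, "K": 1}))
```

Whatever the random assignment of odd entries, the quadrant totals always sum to the number of interactions coded, so percentages computed from them remain comparable across phases.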

The statistical method employed was chi 
square which, though it fails to take into ac- 
count the continuity postulated in the circle 
scheme, involves few assumptions as to the 
nature of the data. The indices suggested by 
Leary (1957, pp. 68-71), while mathematically 
more elegant, would seem to require assump- 
tions even beyond those of ordinal classifica- 
tion. The data for each child were analyzed 
independently in order to avoid confounding. 
Where individual chi squares for each child 
are summed to yield a total estimate of group 
change, the formula is the sum of chi squares in 
one direction minus the sum of chi squares in 
the other direction. Since there are no rational 
bases for expected distributions of behavior (any
assumption that behavior should be distributed
equally into each category is obviously untenable),
total marginal distributions for individual children were used in
obtaining theoretical values. Significance tests 
are reported for two-tailed distributions, for 
although some expectations are rather ob- 
viously directional in studying behavior change 
for this group of children, we were at this 
stage interested in exploring both sides of any 
coins which seemed worth examining. 
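
As a rough illustration of the procedure just described, the sketch below computes a chi square for one child's two-by-two table (target category vs. all other behavior, early vs. late phase), taking expected values from the marginal totals, and then forms the signed group total. It is a minimal sketch under stated assumptions: no continuity correction is applied because the text mentions none, the direction of change is judged by comparing phase proportions, and all counts are invented.

```python
def chi_square_2x2(early_target, early_other, late_target, late_other):
    """Pearson chi square for one child's 2 x 2 table, expected values from the margins."""
    observed = ((early_target, early_other), (late_target, late_other))
    row_totals = [sum(row) for row in observed]
    col_totals = [early_target + late_target, early_other + late_other]
    n = sum(row_totals)
    chi = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi += (observed[i][j] - expected) ** 2 / expected
    return chi


def signed_group_chi_square(tables):
    """Sum of chi squares for decreases minus the sum for increases.

    Each table is (early_target, early_other, late_target, late_other).
    Treating a decrease in the target proportion as the positive direction
    is a convention adopted for this sketch.
    """
    total = 0.0
    for early_target, early_other, late_target, late_other in tables:
        chi = chi_square_2x2(early_target, early_other, late_target, late_other)
        early_rate = early_target / (early_target + early_other)
        late_rate = late_target / (late_target + late_other)
        total += chi if late_rate < early_rate else -chi
    return total


# Invented counts for two hypothetical children.
tables = [(20, 10, 10, 20), (12, 18, 9, 21)]
print(round(chi_square_2x2(*tables[0]), 3))
print(round(signed_group_chi_square(tables), 3))
```

The summed value is referred to a chi-square distribution with one degree of freedom per table. The functions assume that no row or column total is zero.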


8 For the early phase observations, the median
was 98 interactions for a child toward other children
(range 62 to 141), and the median was 89 interactions
toward adults (range 66 to 143). In the observations
made 18 months later, the median number of interactions
toward children was 77 (range 53 to 89); toward
adults, it was 69 (range 61 to 98).


Changes in Interactions Toward Adults


Hostile-dominant interactions. Figure 2 pre- 
sents for each of the children the percentage 
of his total responses toward adults which 
were coded as hostile-dominant at each of the 
two phases. The category represents active 
forms of aggressive behavior. Included are such 
interactions as actively refusing to comply 
with adult requests, making boastful demands 
on adults, threatening or challenging adults, 
attempting to “argue down” an adult, poking 
unfriendly fun at an adult, ordering adults 
around in a boastful or unfriendly manner, and 
any attack on the adult in his role of authority 
with the attempt to negate or degrade the 
authority component. Difficulties in authority 
relationships were prominent in the case 
records of all six boys prior to their admission. 
In one sense, the chief symptoms which entered 
into their selection were their consistent failure 
to accept the roles that adults define for 
children and their active rebellion against adult 
authority. 

In the early phase, the mean percentage of 
hostile-dominant interactions toward adults 
was 28. This proportion is, of course, difficult 
to evaluate without control studies of more 
normal pre-adolescents in a comparable 
environment. For the clinical staff, however, it 
is a patent understatement to say that the 
amount of active hostility shown by these 
children was high.⁹ In the later phase, the mean
percentage of hostile-dominant responses toward
adults dropped to 11. Over the year and a
half between the two phases, each of the
children changed in the expected direction.
The shift was examined for each child separately
by chi square, comparing frequencies
of hostile-dominant vs. all other behavior
toward adults over the two phases. The
changes were significant for one child at p <
.001, for two children at p < .01, for one child
at p < .02, and for two children at p < .20.
The sum of the individual chi squares was
44.23, which with 6 df (one for each two-by-two
table) indicates a change at a level of
confidence well beyond .001.

9 Preliminary data from control studies confirm staff impressions.

Hostile-passive interactions. Hostile-passive 
modes of expressing hostility were, in general, 
less prominent in the relations of the children 
with adults than were the more dominant 
forms of aggression discussed above. Such 
behaviors as whining and complaining in rela- 
tion to adults, accusing adults of punitive 
behavior or attitudes, demanding something 
of an adult in such a way as to imply that the 
adult is an ungiving monster, sulky withdrawal
from interaction, and tearful refusals
constituted a mean of 14% of the
behavior toward adults in the earlier phase 
(Fig. 3). A year and a half later, the mean 
percentage had dropped to 8. Four of the six 
children showed a decrease in passive ex- 
pressions of aggression—one at p < .01, one 
at p < .02, one at p < .10, and one at p < .20; 
two children showed an insignificant increase 
(p < .50 in both cases). The sum of the four
chi squares in the expected direction minus 
the two chi squares in the contrary direction is 
17.62, which with 6 df is significant at a 
confidence level beyond .01. Thus, it would
seem that while the magnitude of decrease in
passively hostile interactions toward adults was
significant, the change was not nearly as great
as in the case of dominant expressions of
hostility.

FIG. 4. CHANGES IN PROPORTIONS OF FRIENDLY-PASSIVE
INTERACTIONS BY CHILDREN TO ADULTS

Friendly-passive interactions. It is interesting 
that despite the fact that these were hyper- 
aggressive children, selected because of their 
unmanageability, the modal response toward 
adults was in the friendly-passive category 
(Fig. 4). Even in the earlier phase, friendly- 
passive behaviors made up the greatest pro- 
portion of interactions with adults for each of 
the children with a mean of 43%. Examination 
of the total interactions toward adults of all 
six children when each was in the primary 
focus of observation indicates that in the earlier 
phase the three highest ranking of the 16 
categories (Fig. 1) were affiliative behaviors 
(M), cooperative behaviors (L), and help-seeking
behaviors (K), which produced, respectively,
15, 14, and 12% of all interactions
toward adults.¹⁰ This phenomenon, rather than
indicating that the children were not “really” 
hyperaggressive, would seem to point in two 
directions. First, these are children and their 
behavior in many ways must resemble that 
of other children of their age. Second, each 
interaction in these analyses carries a single 
weight, and while this arrangement serves its 
purposes, it is unlikely that the recipient, for 
example, is equally impressed by one friendly
“hello” as by a single attack of murderous rage.
The relatively high frequencies of friendly-
passive behaviors do not negate the difficulties
in living with these children. The problem is
that there are no units for effectively gauging
the psychological impact of an action on the
recipient.

10 Active resistance or disagreement (B) was next
with 10%.

By the time of the later period, the per- 
centage of friendly-passive interactions with 
adults had risen to a mean of 63%. Each of the 
children showed an increase in such responses— 
one at p < .001, two at p < .01, one at p < .05,
one at p < .10, and one at p < .30. The sum 
of the chi squares is 45.37, which with 6 df is 
significant at a confidence level well beyond 
.001. Further consideration of the specific
nature of the changes appears below. 

Friendly-dominant interactions. It is difficult 
to know what to expect in the friendly-domi- 
nant area, both in proportions and in actual 
changes. In contrast with the situation for 
aggression and dependency, there has been 
little theoretical or research interest in such 
children’s behaviors as sympathizing with or 
reassuring adults, offering help to adults, and 
teaching or advising adults. One might guess 
that such actions are perhaps not very ap- 
propriate as a major aspect of the relations of 
pre-adolescent boys with adults. For the 
present group, friendly-dominant responses 
constituted a rather small proportion of be- 
haviors toward adults. The values in Fig. 5 
and the mean values—16% in the early and 
18% in the later phase—are somewhat spuri- 
ously high because the major contributory 
entry was from affiliative responses (M), which 
in the analyses were distributed equally between
friendly-dominant and friendly-passive
interactions.¹¹ Only for one child, Bruce, did
the change between the two phases approach 
significance (p < .10), and here it was almost 
wholly a function of the increase in affiliative 
responses (M). The sum of the four chi squares
showing increase minus the two showing decrease
was 2.95, which is not significant
(p < .90).

The “intensity” dimension. Most of the
children’s behavior was considered by the
coders to be appropriate and involved. In the
early phase, this “intensity” category comprised
a mean of 75% of the children’s action
toward adults (Fig. 6). Only 9% of the behavior
in this period (a mean for the six
children) was coded as inappropriate and
involved, that is, as being overly intense or
qualitatively inappropriate to the circumstances.
Uninvolved interactions yielded a
mean of 16% of the behaviors in this phase;
these included such actions as silent rejections
of adult requests, subtle provocations, as well
as token gestures of acceptance or affiliation,
and the term uninvolved is not very adequate.

In any case, the proportions of appropriate-involved
interactions increased in the later
phase from a mean of 75% to a mean of 86%.
Each of the children changed in the expected
direction. Comparing the frequencies of appropriate-involved
behaviors with the summed
frequencies of the other two categories over
the two phases yielded chi squares at confidence
levels of p < .01 for one child, p < .05
for two children, p < .20 for two children, and
p < .90 for the sixth child (two-tailed tests).
The sum of the chi squares was 21.19, which
with 6 df is significant at p < .01. Five children
showed a decrease in inappropriate-involved
behavior; the means went from 9% to 4%. All
six children showed a decrease in uninvolved
actions toward adults, the means dropping
from 16% to 10%. The raw frequencies are
in some cases too small to warrant statistical
analysis, but the trends seem obvious.

11 The total of the three categories, N, O, and P,
contributed 7% to the total number of interactions in
each of the two phases, whereas M contributed 15%
and 21% to the early and late phase, respectively.

Summary comments on changes in interactions 
with adults. Over the year and a half the 
children changed considerably in their be- 
havior toward adults. Primarily, they lessened 
their attempts to dominate adults aggres- 
sively, and they increased their friendly and 
compliant associations with adults. Passive 
expressions of hostility also decreased, and 
in general, behavior became more appropriate, 
but these latter changes, while statistically 
significant, were less striking. 
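
The confidence levels quoted for the summed chi squares can be checked against the chi-square distribution with 6 df. The sketch below is an illustration rather than the authors' computation: it uses the closed-form upper-tail probability that holds for an even number of degrees of freedom, the function name is hypothetical, and it follows the text in referring the signed sums (expected direction minus contrary direction) to the chi-square distribution as if they were ordinary sums.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of chi square for an even number of df.

    For df = 2k the tail is exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Summed chi squares reported for behavior toward adults (6 df each).
reported_sums = {
    "hostile-dominant": 44.23,
    "hostile-passive": 17.62,
    "friendly-passive": 45.37,
    "friendly-dominant": 2.95,
    "appropriate-involved": 21.19,
}
for label, chi2 in reported_sums.items():
    print(f"{label}: p = {chi2_upper_tail_even_df(chi2, 6):.4g}")
```

Under these assumptions the probabilities fall on the same side of the reported cutoffs: beyond .001 for the hostile-dominant and friendly-passive sums, beyond .01 for the hostile-passive and appropriate-involved sums, and between .80 and .90 for the friendly-dominant sum.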

It is obvious that improvement occurred in 
overt behavior toward adults. During the 
period of the study, however, the boys had 
not only been under an intensive residential 
treatment program, but they had also grown 
older. Where so little is known in any syste- 
matic way about the interpersonal behavior 
of normal children and about developmental 
changes in such behavior, there is the per- 
plexing question that is often put as an issue 
between maturation and learning (or treat- 
ment). The question is a complex one, since 
social maturation can never be divorced from 
considerations of the particular environment 
involved and its indulgences, tolerances, and 
demands. For example, the treatment of 
children, in order to be adequate, must take 
growth and development into account. Con- 
versely, maturational phenomena will be in- 
fluenced by the environmental medium in 
which they occur. In the present situation, one 
may legitimately ask: Are the changes neces- 
sarily to be attributed to the treatment pro- 
gram together with growth and development, 
or are they, perhaps, primarily a function of 
normal development within a somewhat benign 
environment? No definitive answer can be 
given without control studies, and there are 
manifold problems even so. There are, however,
some partial cues in the data which indicate
that the treatment, in a broad sense,
is a critical factor. Thus, while one might 
possibly expect decreased aggression and de- 
creased dominance in relation to adults as 
part of a “normal” growth process, though the 
argument would be tenuous, one would not 
expect to find much increase in trusting, de- 
pendent relations with adults. It is interesting 
that out of the 16 possibilities (Fig. 1), the 
category that showed the greatest shift was 
K, which deals with requesting, depending, 
and asking-help behavior, and that the per- 
centage of responses in this category went from 
12% of all behaviors toward adults in the early 
phase to 24% in the later phase. In frequency 
of responses, K shifted from third to first rank. 
It would seem that this was not a maturation 
phenomenon—that is, it is unlikely that 
children either become increasingly dependent 
or increasingly admit their dependency on 
adults with age. One may speculate that the 
evidence points to the dissolution of a defen- 
sive layer so that dependency emerges; such 
a speculation dovetails with the impressions 
of the psychotherapists that oral themes be- 
came very prominent in the later phases of 
psychotherapy with these children. The critical 
question about what factors in the treatment 
program contributed to change can unfortu- 
nately not be answered by a study such as 
this, but some further issues in the process of 
change will be considered below in the discus- 
sion of behaviors directed toward each of the 
children. 
Changes in Interactions Toward Peers 

We turn to the behavior of the children 
toward their peers, again considering each 
child when he was the focal subject for observa- 
tion, but now investigating changes in the 
behavior that he directed toward other chil- 
dren. The discussion can be briefer in this 
section because some of the contextual ma- 
terial has already been presented. 

Hostile-dominant interactions. Figure 7 pre- 
sents for each child at the two phases the 
percentage of his actions which were hostile- 
dominant in orientation toward other children. 
In both phases the means are higher than was 
the case for behavior toward adults—39% as 
compared to 28% in the early phase, and 32% 
as compared to 11% in the later phase. Five
of the six boys were in the early period more
dominantly aggressive toward their peers than 
toward adults, and all six showed this trend 
in the later period. That is, aggressive attempts 
at dominance were, as might be expected, more 
readily expressed toward peers than toward 
the higher status adults. 

All six of the children showed expected de- 
cremental changes over the 18 months. While 
this directional consistency for six cases is sig- 
nificant at a confidence level of p = .03 by a 
simple two-tailed binomial test, none of the 
individual changes reached statistical sig- 
nificance, and the sum of the individual chi 
squares was also nonsignificant. Thus, in con- 
trast to the marked shifts between the two 
phases in proportions of hostile-dominant inter- 
actions toward adults, the changes in inter- 
actions with peers were, at the most, slight. 
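
The binomial figure quoted above follows directly from the null hypothesis that each child is equally likely to change in either direction. A minimal sketch (the function name is hypothetical):

```python
from math import comb


def two_tailed_sign_test(k, n):
    """Two-tailed binomial probability, with p = .5, of a split at least
    as lopsided as k changes in one direction out of n."""
    extreme = max(k, n - k)
    upper_tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


print(two_tailed_sign_test(6, 6))  # 0.03125
```

With all six children changing in the same direction, the probability is 2 × (1/2)⁶ = .03125, which rounds to the reported p = .03.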

Hostile-passive interactions. Although four 
of the six children showed a decrease in hostile- 
passive behaviors (Fig. 8), in no case was the
change significant. One child, Clif, increased
in passive expressions of aggression toward the 
others, and the shift was significant by chi 
square at p < .001. For this child, both passive 
resistance (F) and withdrawal (H) increased 
over the year and a half. His over-all propor- 
tion of hostility rose somewhat, though not 
significantly, since a slight decline in hostile- 
dominant interactions did not compensate for 
the increased passive hostility. The situation 
thus points to the possibility of a slight de- 
terioration in peer relationships for this child 
over the period of the study. The mean per- 
centages of hostile-passive responses for the 
six children are the same in the two phases, 
12%, and the sum of the chi squares for the 
positive change direction minus those in the 
negative direction is, although opposite to the 
direction of improvement because of Clif’s 
contribution, not significant. 

Friendly-passive interactions. Just as active 
aggression was more characteristic in response 
to peers than to adults, so friendly-passive 
interactions, at the opposite side of the circle, 
were less characteristic in peer relationships 
than in relationships with adults. In the early 
phase, friendly-passive actions made up 43% 
of the responses to adults and 24% of the 
responses to other children; in the later period, 
the respective percentages were 63 and 29. 
In behavior toward adults, Categories M, L, 
and K (Fig. 1) held the three highest ranks for 
response frequencies in each of the two phases, 
based on the total number of responses of all 
children in each phase. In behaviors toward 
peers, M (affiliative responses) and L (cooperative
responses) were at Ranks 1 and 4.5 in the
early phase and at Ranks 1 and 2.5 in the later
phase. But the position of K (trusting, de- 
pendent, help-seeking responses) was quite 
different in relations with peers. In the early 
phase, only 2% of all responses were coded K, 
a ranking of 14 in a triple tie; in the later 
phase, only 3% of all responses were coded K,
a ranking of 11.5 in a quadruple tie. Thus, 
unlike the case in relation to adults, de- 
pendency responses among the children oc- 
curred relatively infrequently, and they showed 
rather little shift between the two times of 
study. 

Four boys showed gains in the proportions 
of friendly-passive behaviors toward peers 
(Fig. 9), and for two of them the change was 
significant by chi square at p < .05. None of 
the other shifts were statistically significant; 
the sum of the four chi squares in the expected 
direction minus the two in the opposite direc- 
tion was 8.87 which, with 6 df, yields a con- 
fidence level of p < .20. 

Friendly-dominant interactions. One might 
expect the friendly-dominant mode of inter- 
action to be more characteristic of relations 
among children than from children to adults, 
and such an expectation appears to be legiti- 
mate. In the early phase, an average of 26% 
of the children’s responses to peers were 
friendly-dominant in orientation in contrast 
to the average of 16% in responses to adults; 
in the later phase, the two averages were 28% 
and 18%, respectively. The category M again 
constituted a major source of data for the 
quadrant, and that the values are not deceiving 
is shown by the fact that Categories N, O, and 
P yielded 15% of all responses toward children
in each of the two phases and 7% of all responses
toward adults in each of the two
phases. 

Figure 10 shows four of the six changes to 
be in the direction of an increase in friendly- 
dominant behaviors toward peers, but none of 
the changes in either direction is significant; 
neither is the sum of the chi squares. 

The “intensity” dimension. The distributions 
of responses toward peers are strikingly similar 
to the distributions of responses toward adults 
in the three-category “intensity” classification. 
In the early phase, the mean proportions of 
interactions toward children were 74% ap- 
propriate-involved (as compared to 75% for 
behavior toward adults), 16% uninvolved in- 
teractions (as compared to 16% for behavior 
toward adults), and 9% inappropriate-in- 
volved interactions (as compared to 9% for 
behavior toward adults). In the later phase, 
the three means are, respectively, 84%, 11%, 
and 5% as compared to 86%, 10%, and 4% in 
behavior toward adults. 

Figure 11 shows the shifts in the proportions 
of appropriate-involved behaviors for each 
of the children. All changes were in the ex- 
pected direction of increase. For one child 
the change was significant at p < .02, for 
another at p < .05, and for a third at p < .10; 
for the other three children there is less con- 
clusive evidence of change. The sum of the 
chi squares is 15.85, which, with 6 df, is significant
at a level of p < .02. Decreases in
inappropriate-involved and in uninvolved ac- 
tions toward peers seem by inspection to be 
less consistent than they were in behavior 
toward adults. 

Summary comments on changes in interactions 
toward peers. Certainly, the evidence of change
in relations with peers is much less than in the 
case of relations with adults. The only general 
change in peer directed action that warrants 
much confidence was the increase in the rela- 
tive proportion of appropriate behavior. In 
other aspects, the directions were similar to 
those for behavior toward adults—toward a 
decrease in dominant aggressive actions and 
toward an increase in friendly compliant ac- 
tions—but the shifts over the year and a half 
were, for the most part, unimpressive. 

To the question of why changes in relations 
to adults were so much more striking than were 
changes in peer relations, no clear answer can
be given. There is the possibility that interpersonal
behavior with peers was less disturbed
than that with adults. Control studies
would be required to confirm or disaffirm this 
statement. There is also the possibility that 
changes occur earlier in the treatment process 
in relation to adults than in relation to peers. 
Such an hypothesis would be reasonable 
though not necessary on the basis of clinical 
evidence that the difficulties of these boys de- 
veloped out of primary relationships with 
parental figures. Perhaps, with these primitive 
children, some resolution of earlier relationship 
problems must occur before the genetically 
more advanced problems of peer relationships 
can be met. Follow-up studies and studies of 
other clinical groups would be useful here. 
There is also a possibility that the observation 
method and the instrument are less potent 
for gauging peer interactions than for inter- 
actions of children toward adults. Some light 
is cast on this latter issue in the discussions that 
follow. 


FURTHER EXPLORATIONS 


The aim in the following sections is to explore 
some phenomena of group and individual inter- 
personal behavior. Such exploration can per- 
haps illuminate some of the findings already 
referred to. It can point to further potentialities 
or limitations of the methods. Most important, 
it can probe toward the formulation of general 
principles. 


Group Similarities 


The selection of the children and their environmental
homogeneity within the institution
should make for behavioral similarities in
their interactions. That the children did re- 
semble one another in interpersonal behavior 
is shown as follows: When the 16 categories 
were ranked for each child in accordance with 
the frequencies of his interactions, Kendall 
Ws, computed for the six children and cor- 
rected for ties, were .61 for interactions toward 
peers and .59 for interactions toward adults 
in the early phase; in the later phase, the cor- 
responding Ws were .61 and .76. All Ws are 
significant at levels of p < .001. Furthermore, 
behavior between the two phases correlated
significantly. When the interactions for the
group taken as a whole in the 16 categories were
ranked for each phase, the Spearman rho,
corrected for ties, for behavior toward peers
was .79. Even in the case of interactions
toward adults, where the differences between
phases were highly significant, the rank orders
of interaction categories were similar between
the two periods a year and a half apart. The
rho, corrected for ties, was .75.¹² Thus, although
interpersonal behavior may change over time,
it also seems to maintain a certain consistency.

12 The critical value for significance at p = .01
(one-tailed test) is a rho of .60.

The concordances among individuals and
the consistencies in group behavior raise a
number of questions. We do not know to what
extent correspondences are a function of the
homogeneity of the group and of the environmental
milieu, or, on the other hand, to what
extent they are related to the facts that the
subjects are children of a given age, or children
in a specific culture, or that they are simply
children. Speculations here would perhaps be
promiscuous. There are, however, some data
presented by Barker and Wright (1954, Ch.
12) which offer possibilities for very crude comparison.
These authors show rankings of interactive
categories for the behavior of eight
“normal” children, each of whom was observed
individually for a day in the small
Midwestern town in which the children lived.
All interactive behavior was noted and later
coded. Four of the children were boys and four
were girls; ages range from 1 year, 10 months
through 10 years, 9 months, with a median
age of 5 years, 10 months. Thus, the children
and their environments were very different
from those of the present study. The behavioral
categories employed in the two studies also
differed, but on an a priori, face-validity basis,
it was decided which categories were comparable
in the two schemes.¹³ The rank orders of
interactive behavior for these two different
samples of children in different environments
were correlated. Behavior toward adults
yielded rhos of .24 and .31 between the Barker
and Wright data (1954, p. 429) and the present
study’s early and late phases, respectively;
behavior toward children yielded rhos of .43
and .29. Although none of the rhos is significant
with eight sets of ranks, all are positive.
By themselves, positive correlation coefficients
can be artifacts of the categories
employed; for example, by choosing walking
and flying as categories for classification, it
can be shown (not wholly illegitimately)
that people are all alike. The matter is, however,
not so readily dismissed. In the course of
the day, the Barker and Wright children had
transactions with adults; the same was true
for the six boys in the present study. When the
total adult behaviors toward the Barker and
Wright children (1954, p. 425) were compared
in rankings with the total adult behaviors
directed toward the six hyperaggressive boys,
the rhos were .93 (p < .01) for the early phase
and .79 (p < .05) for the later phase. That is,
despite the differences in children in age and
psychological status, despite the differences in
environments (home, school, and neighborhood
versus a psychiatric milieu), despite the differences
in adults (parents, neighbors, and
teachers as against counselors, nurses, physicians,
and teachers), and despite differences
in methods of study, patterns of behavior of
adults toward children apparently have much
in common. The patterns of behavior of the
very different samples of children would seem
to have less, though still something, in
common. In isolation, such findings are difficult
to evaluate. They do, however, point to the
desirability of investigating commonalities and
differences in interpersonal patterns in relation
to maturational factors and in relation to
cultural and subcultural variations.
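
The rank correlations above can be illustrated with a short sketch. It is not the authors' computation: the rhos in the text were corrected for ties, whereas the function below uses the simple no-ties formula; the two rankings are hypothetical; and the t approximation with n - 2 df is one common way of judging a rho, assumed here for illustration with n = 8 sets of ranks.

```python
import math


def spearman_rho(ranks_a, ranks_b):
    """Spearman rho for two rankings without ties (no tie correction)."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def t_for_rho(rho, n):
    """t statistic (n - 2 df) often used to approximate the significance of rho."""
    return rho * math.sqrt((n - 2) / (1 - rho ** 2))


# Hypothetical rankings of eight behavior categories in two samples.
sample_a = [1, 2, 3, 4, 5, 6, 7, 8]
sample_b = [2, 1, 4, 3, 6, 5, 8, 7]
print(round(spearman_rho(sample_a, sample_b), 3))

# t values for the rhos reported in the text, taking n = 8.
for rho in (0.24, 0.31, 0.43, 0.29, 0.93, 0.79):
    print(rho, round(t_for_rho(rho, 8), 2))
```

Against the tabled two-tailed critical values of t for 6 df (2.447 at .05 and 3.707 at .01), this approximation agrees with the reported pattern: the four child-behavior rhos fall short of significance, .79 passes the .05 value only, and .93 passes the .01 value.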


less 


factors 


13 Dominance in the Barker and Wright scheme was
considered equal to the sum of A and P (Fig. 1),
Appeal = K and J, Resistance = B, C, and F, Nurturance
= N and O, Aggression = D and E, Submission
= I, Compliance = L, and Avoidance =
G and H. Affiliative responses, M, were ignored, since
affection is treated under a different coding system by
Barker and Wright (1954, Ch. 10).

Individual Differences


Another side of the coin is the matter of in- 
dividual differences. The side one emphasizes, 
resemblances or differences, depends on what 
one wishes to talk about. General arguments 
as to which are greater are specious, a pitfall 
researchers have not always avoided. For a 
scheme to be maximally useful, it should be 
capable both of abstracting general phe- 
nomena and of demonstrating differentiations, 
and, all other things being equal, the finer its 
discriminations, the greater its potential. An 
initial and minimal test of the clinical utility 
of an instrument is its capacity for showing 
individual differences. Considering the earlier 
observations by quadrants, the six children 
differed among themselves in both interactions 
toward each other (Table 1, p < .001) and 
in interactions toward adults (Table 2, 
p < .01). The “intensity” codings seemed less 
sensitive to individual factors (Table 3, p < .01 
in interactions with children; Table 4, p < .10 
in interactions with adults). Individual dif- 
ferences among the children also appeared in 
the later phase in interactions toward peers 
(Table 5, p < .01) and in interactions toward
adults (Table 6, p < .05), but they seemed 
somewhat attenuated, perhaps by the long 
period of close communal living. The “in- 
tensity” codings yielded results similar to the 
earlier phase (Table 7, p < .01 for interactions 
with peers; Table 8, p < .50 for interactions 


TABLE 1
INDIVIDUAL DIFFERENCES IN MODES OF INTERACTION
TOWARD PEERS: EARLY PHASE
(N = 568)

         Hostile-     Hostile-     Friendly-    Friendly-
         dominant     passive      passive      dominant
         Actions      Actions      Actions      Actions

Tony     19           24           35           24
         (38.97)*     (11.49)      (24.78)      (26.76)
Bruce    33           8            10           11
         (23.69)      (6.99)       (15.06)      (16.26)
Clif     37           3            26           32
         (37.44)      (11.04)      (23.81)      (25.71)
Dave     34           12           17           35
         (37.44)      (11.04)      (23.81)      (25.71)
Ed       26           10           18           13
         (25.60)      (7.55)       (16.28)      (17.58)
Frank    68           7            32           34
         (53.87)      (15.89)      (34.26)      (36.99)

* Expected values based on marginal distributions are in
parentheses. χ² = 60.13, df = 15, p < .001.


TABLE 2
INDIVIDUAL DIFFERENCES IN MODES OF INTERACTION
TOWARD ADULTS: EARLY PHASE
(N = 542)

         Hostile-     Hostile-     Friendly-    Friendly-
         dominant     passive      passive      dominant
         Actions      Actions      Actions      Actions

Tony     24           10           40           15
         (24.96)*     (12.97)      (36.29)      (14.78)
Bruce    23           25           33           8
         (24.96)      (12.97)      (36.29)      (14.78)
Clif     15           4            27           10
         (15.70)      (8.16)       (22.83)      (9.30)
Dave     13           6            36           11
         (18.51)      (9.62)       (26.91)      (10.96)
Ed       42           24           45           32
         (40.10)      (20.84)      (58.31)      (23.75)
Frank    35           10           40           14
         (27.76)      (14.43)      (40.37)      (16.44)

* Expected values based on marginal distributions are in
parentheses. χ² = 34.94, df = 15, p < .01.


TABLE 3
INDIVIDUAL DIFFERENCES IN “INTENSITY” OF INTERACTION
TOWARD PEERS: EARLY PHASE
(N = 568)

                      Involved-      Involved-
         Uninvolved   appropriate    inappropriate
         Actions      Actions        Actions

Tony     12           86             4
         (16.34)*     (76.50)        (9.16)
Bruce    14           40             8
         (9.93)       (46.50)        (5.57)
Clif     24           72             2
         (15.70)      (73.50)        (8.80)
Dave     14           70             14
         (15.70)      (73.50)        (8.80)
Ed       7            52             8
         (10.73)      (50.25)        (6.02)
Frank    20           106            15
         (22.59)      (105.75)       (12.66)

* Expected values based on marginal distributions are in
parentheses. χ² = 24.71, df = 10, p < .01.

with adults), but again perhaps somewhat 
attenuated. 

In all these analyses, individual differences 
in behaviors toward children seem greater than 
do individual differences in behaviors toward 
adults. This trend would indicate that those 
failures to find significant differences in peer 
relations between the two phases, as compared 
to the significant phase differences in relations 
with adults, were not a function of the lack 
of sensitivity of the method to peer inter- 
actions. On a theoretical level, the trend may 
point to a more general issue of role relation-

TABLE 4
INDIVIDUAL DIFFERENCES IN “INTENSITY” OF INTERACTION
TOWARD ADULTS: EARLY PHASE
(N = 542)

                      Involved-      Involved-
         Uninvolved   appropriate    inappropriate
         Actions      Actions        Actions

Tony     14           69             6
         (14.61)*     (65.19)        (9.20)
Bruce    18           59             12
         (14.61)      (65.19)        (9.20)
Clif     5            47             4
         (9.20)       (41.02)        (5.79)
Dave     14           50             2
         (10.84)      (48.34)        (6.82)
Ed       23           97             23
         (23.48)      (104.74)       (14.77)
Frank    15           75             9
         (16.26)      (72.51)        (10.23)

* Expected values based on marginal distributions are in
parentheses. χ² = 16.83, df = 10, p < .10.
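
The expected values and chi square printed with Table 4 can be approximately reproduced from the observed counts alone. The sketch below is an illustration under stated assumptions: the counts are as read from Table 4, the column order (uninvolved, involved-appropriate, involved-inappropriate) is inferred from the percentages given in the text, and the function name is hypothetical. Each expected value is the child's row total times the column total divided by N.

```python
# Observed counts as read from Table 4 (early phase, behavior toward adults).
# Assumed column order: uninvolved, involved-appropriate, involved-inappropriate.
observed = {
    "Tony": (14, 69, 6),
    "Bruce": (18, 59, 12),
    "Clif": (5, 47, 4),
    "Dave": (14, 50, 2),
    "Ed": (23, 97, 23),
    "Frank": (15, 75, 9),
}


def chi_square_from_marginals(table):
    """Pearson chi square with expected values taken from the marginal totals."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    expected = {}
    chi2 = 0.0
    for name, row, row_total in zip(table, rows, row_totals):
        expected[name] = [row_total * col_total / n for col_total in col_totals]
        chi2 += sum((o - e) ** 2 / e for o, e in zip(row, expected[name]))
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


chi2, df, expected = chi_square_from_marginals(observed)
print(round(chi2, 2), df)
print([round(e, 2) for e in expected["Tony"]])
```

With these counts the total comes to N = 542 and df = 10, and the chi square comes out at about 16.8, close to the printed 16.83; the small difference presumably reflects rounding of the expected values in the original hand computation.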


TABLE 5
INDIVIDUAL DIFFERENCES IN MODES OF INTERACTION
TOWARD PEERS: LATER PHASE
(N = 430)

         Hostile-     Hostile-     Friendly-    Friendly-
         dominant     passive      passive      dominant
         Actions      Actions      Actions      Actions


Tony 13 20 18 
(6.84) (17.5 (16.60) 
Bruce 2! 5 11 
6 (15.53 
Clif 15 7 
.40) (8.43 
Dave 6 
26.57) (9.57) 
Ed 23 6 
(24.99) (9.00) a. 88 (21.86) 
Frank 30 4 ‘ 18 
(25.30) (9.12) 3.44) | (22.14) 


* Expected values based on marginal distributions are in
parentheses. χ² = 36.88, df = 15, p < .01.


ships. We may speculate that people will tend 
to act in a more individualized fashion toward 
those in their own status group than they do 
toward groups of rather different statuses. 
That is, not only may there be a cognitive 
tendency for one group to stereotype another 
group of very different status, but patterns of 
behavior in a group may be less variable in 
relations to other groups than in relations to 
peers. To the other-status recipient of be- 
havior, the behavioral cohesiveness of a group 
may then appear to be greater than it actually 
does to the within-group members. 


TABLE 6
INDIVIDUAL DIFFERENCES IN MODES OF INTERACTION
TOWARD ADULTS: LATER PHASE
(N = 438)

         Hostile-     Hostile-     Friendly-    Friendly-
         dominant     passive      passive      dominant
         Actions      Actions      Actions      Actions

9 15 

(10.96)* (8.50) 

13 

(8.61) 

6 


(7.83) | 
6 | 


(6.82) 
Ed : 

(7.50) 
Frank 12 

(7.27) | | (11.58) 


* Expected values based on marginal distributions are in
parentheses. χ² = 27.81, df = 15, p < .05.


TABLE 7
INDIVIDUAL DIFFERENCES IN “INTENSITY” OF INTERACTION TOWARD PEERS: LATER PHASE
(N = 430)

Uninvolved Actions | Involved-appropriate Actions | Involved-inappropriate Actions


53 5 
(50.51) (2. 





Tony 


Bruce 7 
: .62) (2 
Clif 2 
.30) (3. 

Dave 3 0 
Ja) (3. 
Ed 


1 
(3. 
4 


Frank 
(3.53) 
* Expected values based on marginal distributions are in
parentheses. χ² = 24.41, df = 10, p < .01.





Some Dynamic Aspects of Interpersonal Processes

In all the foregoing discussion, it is as though 
each child were an independent, self-de- 
termined unit. Yet one of the basic facts of 
interpersonal processes is the interdependence 
and continuity of behavior. One’s actions evoke 
actions by others, and the actions of others, in 
turn, stimulate one’s own behavior. Leary 
(1957, Ch. 7) has commented on and given 
some examples of reciprocality in interpersonal 
behavior among adults. For cues as to this 
issue, the behavior that each child received 
TABLE 8
INDIVIDUAL DIFFERENCES IN “INTENSITY” OF INTERACTION TOWARD ADULTS: LATER PHASE
(N = 438)

Uninvolved Actions | Involved-appropriate Actions | Involved-inappropriate Actions


Tony 10 84 
(9.8 (83.90) 
Bruce 7 65 

(7. (65.92) 
Clif 4 60 

(7. (59.93) 
Dave 7 

(6.13 
Ed 10 

(6. 
Frank 6 

(6.5

* Expected values based on marginal distributions are in
parentheses. χ² = 9.55, df = 10, p < .50.


from his peers and from adults was examined. 
Tables 9 through 12 present the frequencies 
of interactions and percentages in the four 
categories of response for behavior each child 
received from others.¹⁴

Reciprocality in adult behavior toward chil- 
dren. The behavior of the adults contrasted 
sharply with that of the children. The majority
of the adult interactions (a mean over the
six children of 58% in the early phase and
72% in the later) were in the friendly-dominant
category, composed of friendly, nurturant,
supporting, giving, and guiding activities.
A further demonstration of the contrast lies in
the comparison between behavior “sent” by
the children and the behavior they “received.”
In the early phase, a mean of 51% of the 
responses which individual children “sent” 

14 These tables were constructed using the interac- 
tions of the children when they were not in the central 
focus of observation. That is, in tabulating, the interac- 
tions of the central subject were ignored; the interac- 
tions of all other persons, children and adults, were 
tabulated according to whom they were directed 
toward. This arbitrary rule insures that the same data 
which entered into the previous tables (1 through 8) 
do not enter into these tables. At the same time, it 
should be noted that the entries in any cell do not 
necessarily represent the equal contribution of all 
participants to the behavior directed toward a given 
child. For example, Tony may have received 10 hostile- 
dominant actions from Bruce and 5 hostile-dominant 
actions from Frank. Because of such possibilities of 
confounding, these data are not analyzed by chi 
square. The discussion that follows is based on visual 
inspection of the data. 
TABLE 9
ACTIONS BY PEERS TOWARD EACH CHILD: EARLY PHASE

Hostile-dominant Actions | Hostile-passive Actions | Friendly-passive Actions | Friendly-dominant Actions
(each column: Freq. | %)


Tony 74 | 46 44 
Bruce 51 | 37 31 
Clif 65 | 35 44 
Dave 96 | 42 : 48 
Ed 50 | 40 | 11 37 
Frank 78 | 39 31 | 5 37 


























TABLE 10
ACTIONS BY ADULTS TOWARD EACH CHILD: EARLY PHASE
(N = 557)

Hostile-dominant Actions | Hostile-passive Actions | Friendly-passive Actions | Friendly-dominant Actions
(each column: Freq. | %)



| 
16 | 18 65 
19 | 23 52 | 62 
24 | 33 ! 35 
20 | 20 66 
16 | 16 | 5 | 28 | 27 | 54 
| 26 | 24 | | 23 | 21 | 53 




















TABLE 11
ACTIONS BY PEERS TOWARD EACH CHILD: LATER PHASE
(N = 920)

Hostile-dominant Actions | Hostile-passive Actions | Friendly-passive Actions | Friendly-dominant Actions
(each column: Freq. | %)

25 | 28 
44 | 32 
58 | 27 
47 | 30 
| 35 | 25 
| 40 | 23 







TABLE 12
ACTIONS BY ADULTS TOWARD EACH CHILD: LATER PHASE
(N = 658)

Hostile-dominant Actions | Hostile-passive Actions | Friendly-passive Actions | Friendly-dominant Actions
(each column: Freq. | %)



19 | 16 
18 | 15 13 
20 7 
6 18 


8 
22 
toward each other were hostile in orientation
(Table 1); they “received” a mean of 51%
hostile actions from each other (Table 9).
For the later phase, the values were 44%
hostile responses “sent” and 43% “received”
(Tables 5 and 11). In contrast was the situation
with adults. Whereas the children in the early
phase “sent” a mean of 42% hostile actions
to adults, they “received” a mean of only 25%
hostile actions in return (Tables 2 and 10).
For the later phase, the values were 19%
“sent” and 15% “received” (Tables 6 and 12).
Although the data are, of course, not definitive,
they hint that interpersonal processes of
change, insofar as change occurred, were adult
rather than child initiated. It would seem
reasonable that if interaction were purely
reciprocal (an eye for an eye) one could not
expect changes in the direction of the
participants’ actions, at least after a stable status
order had been established. It is probably at
the interruptions in patterns of reciprocality
that the potential for interpersonal change
arises. This is not to say that such interruptions
are sufficient requirements for change,
although they may be necessary ones.

Nor does the foregoing discussion imply that
the adults did not, in some measure, respond
reciprocally to the children. Although the
adults may have initiated the process of
change, they, like the children, showed
systematic shifts in behavior between the two
phases. The proportions of aggression expressed
by adults decreased in relation to each of the
six boys.¹⁵ Noted previously was the fact that
the major single shift in the 16 categories on
the part of the children in their behavior
toward adults was the increase in help-requesting,
dependent actions (K); it is interesting
that the major single category of
change in adult responses over the two phases
was the increase in giving and help-offering
responses (O) from 23 to 34% of the total
adult behaviors. The adults also changed in
the “intensity” of behavior toward the children.
The proportions of appropriate-involved
behaviors as compared to uninvolved and
inappropriate-involved behaviors increased in
relation to each child, the means going from
82 to 90%.

15 The mutual changes raise the possibility of an
interesting counter-argument that the children do
not “really” change at all; rather, they simply respond
reciprocally to “improvement” in the adults brought
about through training and experience. An adequate
refutation of this view as a complete explanation of
the changes in the children’s relations with adults
would probably require data extraneous to the present
study, although study of the sequences of interactions
would provide some cues. Note that such a
counter-argument can only arise when there is actual
information about the behavior of the “others.” The usual
assumption in studies of change is of a constant or
randomly fluctuating environment. The data put into
question the tenability of such an assumption.

Reciprocality in peer behavior. Changes in
actions “received” from peers also, in general,
paralleled the previously noted changes in
actions “sent” toward peers. The evidence for
a general systematic change in the quality of
interactions toward peers was noted as slight.
Similarly, there appears to be little over-all
change in the quality of interactions received
from peers (Tables 9 and 11). Dave was the
child who changed most in the direction of
decreased aggression toward peers. Other
children changed similarly in their behavior
toward him. A further parallel appears in the
case of Clif, who, as mentioned previously, was
unusual in that while he decreased in dominant
forms of aggression toward peers, he showed
significantly more passively hostile actions in
the later phase. In the later as compared to the
earlier period, Clif received more dominantly
hostile but fewer passively hostile actions from
peers, the only instance where this occurred.
“Intensity” changes in “received” behavior
parallel “intensity” changes in “sent” behavior.
In all six cases the proportions of
appropriate-involved actions increased. But there
are deviations from parallelism. For example,
there is little evidence that Bruce changed
much in his actions toward peers. They, however,
seemed to change toward him. He received
less aggression and less inappropriate
behavior in the later phase.

In general, then, there is a mutuality in 
interpersonal processes of change. As inter- 
personal behavior changes, new equilibrium 
relations tend to form between the person 
and others. Unfortunately, the study can say 
all too little about the circumstances under 
which changes in equilibria occur, or about 
such problems as lags in mutual rearrange- 
ments. It would also seem that realignments 
may occur on bases which this study missed. 
One would guess that such factors as interpersonal
skills and sensitivities, special cognitive
abilities, stability or erraticness of behavior,
special friendship and leadership patterns
all play a part.

What can be said about the specific nature 
of reciprocal action? Obviously, nothing defini- 
tive, but there are some cues. Let us look, for 
example, at Tony’s relations with the other 
boys during the early phase. Tony was the 
least dominantly aggressive and the most 
passively aggressive of the children. Comple- 
mentarily, Tony received the greatest propor- 
tion of dominant aggression and the lowest 
proportion of passive aggression from his 
peers.¹⁶ While Tony initiated the highest
proportion of friendly-passive responses to other
children, he received the lowest proportion of 
such responses from peers but the second 
highest proportion of friendly-dominant be- 
haviors. In contrast to Tony, Frank “sent” 
a high proportion of dominantly aggressive 
and a low proportion of passively aggressive 
behaviors toward the other boys and received 
from them a high proportion of passive-hostile 
and a relatively low proportion of dominant- 
hostile responses. But, as seems true for 
reciprocality in change, patterns of comple- 
mentariness in action are not equally clear for 
all children. 

That passive aggression evokes dominant 
aggression and that dominant aggression 
evokes passive aggression is not unexpected. 
What is more interesting is a tendency in the 
data for passive aggression to evoke friendly- 
dominant behavior, and for dominant aggres- 
sion to evoke friendly-passive behavior. The 
evidence is obviously meagre and the indica- 
tions are not wholly consistent, but it would 
seem that hostile oriented actions were not 
always met with counterhostility. Do “successes”
maintain the behaviors? If so, how is
the alternative cycle of reciprocal aggression 
to be counteracted? 


SUMMARY 


An exploratory study was made of the interpersonal
behavior of six hyperaggressive boys
in residential treatment. Each child was observed
twice in six life settings and his interactions
with both peers and adults were noted.
The observations were repeated after a year
and a half in the treatment program.

16 Information for these and the following comments
may be reconstructed from the tables presented.

Over the year and a half the interpersonal 
behavior of the children shifted considerably. 
The major changes were in the relations of the 
children with adults. Here, there was primarily 
a decrease in hostile-dominant behavior and 
an increase in friendly-passive behavior. The 
appropriateness of behavior increased both in 
relations with children and with adults. The 
patterns of change were consistent with treat- 
ment aims, and they seemed, at least in part, 
a function of the treatment program. 

Group similarities and individual differences 
among the children were noted, and patterns 
of reciprocality in behavior between children 
and adults and among children were explored. 
In relations with peers, the children received 
about the same amount of aggression as they 
expressed. They received less aggression from 
adults than they expressed toward them. 
Changes in patterns of behavior toward others 
were accompanied, in general, by reciprocal 
changes in the behaviors of others, both adults 
and children. 

The study demonstrates (a) that systematic 
observation and coding of the interpersonal 
behavior of a small group of children in natural- 
istic settings can yield tenable descriptions, 
orderly relationships, and some tentative
hypotheses about interpersonal processes, (b)
that hyperaggressive children can change in 
residential treatment in a direction consistent 
with therapeutic aims, and (c) that the mode 
of observation described here, together with 
the scheme for coding interpersonal behavior 
(Freedman et al., 1951; Leary, 1957), has 
some measure of utility. 
PSYCHOLOGISTS have only recently begun
to show an interest in the relationships
between the structure of groups and
their problem-solving effectiveness. Rosenberg, 
Erlick, and Berkowitz (9) have demonstrated 
an “assembly effect” in the group product, an 
effect based not on the characteristics of each 
person but on the relations of the character- 
istics of each individual to those of the others 
in the group. Schutz (10) showed that the per- 
formances of “compatible” groups were 
superior to those of “incompatible” groups on 
two tasks. In these studies, as in most problem- 
solving experiments, the nature of the tasks 
used was undefined. Certain questions remain 
unanswered: Does an “assembly effect” occur 
on all tasks? Are “compatible” groups superior 
on every type of problem? 

The present experiment examined the in- 
terrelationships among different group struc- 
tures and different task characteristics, as 
these interrelationships determined the “quality”
and “acceptance” of the group product
(6). The groups varied in the degree of simi- 
larity or homogeneity of their members’ per- 
sonalities. The effects of these variations were 
examined in terms of the quality and accept- 
ance of solutions to two different problems. 
One problem involved quality alone; the other 
involved acceptance and quality. 


DEVELOPMENT OF HYPOTHESES 


Both Maier (5) and Duncker (1) have emphasized
the importance of the “direction” or
initial orientation toward a problem in the 
problem-solving behaviors of individuals. 


1 This article has been adapted from a dissertation
submitted in partial fulfillment of the requirements for 
the degree of doctor of philosophy at the University of 
Michigan (3). The author wishes to express apprecia- 
tion to Norman R. F. Maier for his continuing interest 
and stimulating guidance in the research and in the 
preparation of this report. Thanks are also offered to 
Arnold Tannenbaum and Annette Wigod for construc- 
tive criticisms of this article. 

2 Now with Engineering Research Institute and the 
Department of Psychology, University of Michigan. 
Maier presented evidence that a person who
has many “directions” available, i.e., is capable 
of many restructurings of his perceptual field, 
is more likely to be a successful solver than 
the person who is inflexible and adheres to a 
single direction. 

On the assumption that “direction” operates
in the same way in groups as in individuals, the 
variety of possible directions available from 
the members of Nonhomogeneous groups, de- 
riving from their varied perceptual and cogni- 
tive structures, should yield higher quality 
solutions in these groups than in Homogeneous 
groups. The higher quality solutions should 
result from the many different ideas and from 
the emphasis placed on different aspects of a 
problem by the members of Nonhomogeneous 
groups. This conclusion should be true only 
for those tasks requiring multiple perceptions 
and cognitive reorganizations for their solu- 
tion. Hypothesis I states therefore: On a 
problem involving quality alone, Nonhomoge- 
neous groups produce higher quality solutions 
than do Homogeneous groups. 

The nature of Nonhomogeneous groups im- 
plies differences in affective structures—in 
the innerpersonal regions—as well as in cogni- 
tive structures among the group members. 
When a problem has no objectively good solu- 
tion—i.e., when the quality of the solution can 
be judged only in terms of the members’ per- 
sonal values and standards—then differences 
in affective structures in Nonhomogeneous 
groups should again produce conflict. The 
result of such conflict is difficult to predict. If 
the values involved are not central to the mem- 
bers of the group there is a high probability 
of affective restructuring and easy resolution 
of the conflict. The group is likely to gen- 
erate a unique and interesting solution. If the 
issues are more central, affective restructur- 
ing is less likely to occur, and the Nonhomoge- 
neous group is more likely to fail to agree on a 
solution. The status quo situation will con- 
tinue. Problems requiring affective restructur- 
ing are considered to be acceptance problems. 
To the extent that a problem involves both
quality and acceptance, successful solutions 
require changes in both cognitive and affective 
structures of the group members. Hypothesis 
II therefore states: On a problem involving
quality and acceptance, Nonhomogeneous 
groups either fail to agree on a solution and 
resist change, or produce inventive solutions. 
Homogeneous groups produce few inventive 
solutions, but either fail to agree or accept an 
alternative solution offered to them. 

Hypothesis III, concerning satisfaction with 
the solutions, also derives from the preceding 
considerations. By assumption, a solution is 
acceptable to a person to the extent that it 
satisfies his needs in the situation; in a problem- 
solving situation a good solution satisfies cer- 
tain of his needs. If he then perceives that the 
best suggestions made by the members are in- 
corporated in the group’s solution, he should 
be satisfied with the solution. In the Homo- 
geneous groups, approval should be almost 
unanimous. The solutions, although incorpo- 
rating only a limited number of ideas, repre- 
sent the opinions of all the group members, 
since these should tend to be similar. In Non- 
homogeneous groups, however, unanimous 
satisfaction should be found only in those 
groups with high quality solutions, where the 
multitude of ideas produced have been used in 
the solution. On this basis, Hypothesis III 
states: For Homogeneous groups, there is no 
relationship between the degree of satisfaction 
with the solutions and their quality for either 
type of problem. For Nonhomogeneous groups, 
there is a positive relation between the degree 
of satisfaction with and the quality of the 
solutions. 


METHOD 


The experimental procedure was conducted twice, 
once in the spring semester and again in the fall se- 
mester of 1955. The subjects (Ss) were sophomore, 
junior, and senior students in the undergraduate 
course, psychology of human relations. The 175 stu- 
dents were assigned to seven laboratory sections of 
about 25 students each. In laboratory sessions, the 
students participated in a case discussion or role- 
playing situation each week to learn skills in inter- 
viewing and in group leadership. 


Personality Measure 


The Guilford-Zimmerman Temperament Survey
(GZTS) (2) was administered as the personality
measure at the beginning of each semester to all
students. Although the GZTS was developed to measure
personality traits and not tension systems in people,
it was selected as the personality measure for its
reliability and for the relative independence of its
dimensions. It was assumed that the ten traits it
measures provide a sample of the personality charac- 
teristics such that Ss with similar profiles are likely to 
be more similar in personality than those with dis- 
similar profiles. Kendall’s tau was used to determine 
the correlation between the ten-score profiles of every 
pair of students in each laboratory section. Approxi- 
mately 2,000 correlations were computed each semester 
for the seven laboratory sections. With the aid of the 
Michigan Digital Automatic Computer (MIDAC), 
these computations were made in sufficient time to 
establish the groups by the following week’s meeting.³
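The profile-correlation step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the article does not say which variant of Kendall's tau was used or how ties were handled, so the sketch uses tau-a (ties contribute nothing), and the student names and scores are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Assumption: tied pairs count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    score = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        score += (product > 0) - (product < 0)
    n = len(x)
    return score / (n * (n - 1) / 2)


def pairwise_profile_taus(profiles):
    """Tau for every pair of students; `profiles` maps name -> score list."""
    return {
        (a, b): kendall_tau_a(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


def pairs_in_section(n_students):
    """Number of distinct student pairs in one laboratory section."""
    return n_students * (n_students - 1) // 2


if __name__ == "__main__":
    # Hypothetical ten-score profiles, not data from the study.
    demo = {
        "S1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        "S2": [2, 1, 4, 3, 6, 5, 8, 7, 10, 9],
        "S3": [10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
    }
    for pair, tau in pairwise_profile_taus(demo).items():
        print(pair, round(tau, 2))
    # Seven sections of about 25 students each.
    print(7 * pairs_in_section(25))
```

With seven sections of about 25 students, the pair count comes to 7 × 300 = 2,100, in line with the "approximately 2,000 correlations" reported per semester.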

The reliability of the profile correlations was com- 
puted by a split-half technique on a sample of 50 Ss. 
Corrected Spearman rank-order correlation coefficients 
between odd and even halves of the GZTS had a 
median of .77. Thirty-seven of the 50 coefficients 
equaled or exceeded .64, the value needed for signifi- 
cance at the .05 level of confidence. 
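A minimal sketch of a split-half reliability computation follows. It assumes that "corrected" refers to the Spearman-Brown step-up formula, which the article does not state, and that tied scores receive average ranks; the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank-order correlation (Pearson correlation of ranks)."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length (assumed correction)."""
    return 2 * r_half / (1 + r_half)
```

For one student, `spearman_rho` would be applied to the ten trait scores from the odd items and the ten from the even items, and `spearman_brown` to the result.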


Experimental Groups 


Three types of four-person groups were assembled.
Type 1 groups (Homogeneous) were composed of
persons with high positive taus; Type 2 groups were
composed of persons with both high negative and
high positive profile correlations; and Type 3 groups
consisted of persons whose taus were approximately
.00. The number of groups of each type initially
established were, respectively, 20, 15, and 22 for a total
of 57 groups. A comparison of the algebraic and absolute
sums of tau for the three types of groups provided
statistical evidence that there were distinct qualitative
differences in the personality mixtures of each group
type. The Homogeneous groups, moreover, represented
such different dominant personality patterns that a
subsequent attempt to analyze possible differences
among groups of this type had to be abandoned for
lack of a sufficient number of groups of any one
personality type. Although students were assigned to
these different types of groups purely on the basis of
their personality characteristics, comparisons of the
various types of groups also indicated no differences in
the sex composition (the mixture of males and females)
and in the mean final course grade.
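The comparison of algebraic and absolute sums of tau can be illustrated as follows. This is a sketch only: the article gives no cutoffs for the three group types, so none are assumed here, and the member labels and tau values are hypothetical.

```python
from itertools import combinations


def group_tau_sums(members, tau):
    """Return (algebraic sum, absolute sum) of tau over all pairs in a group.

    `tau` maps frozenset({a, b}) -> profile correlation for that pair.
    A four-person group has six pairs.
    """
    values = [tau[frozenset(pair)] for pair in combinations(members, 2)]
    return sum(values), sum(abs(v) for v in values)


if __name__ == "__main__":
    members = ["A", "B", "C", "D"]
    pairs = [frozenset(p) for p in combinations(members, 2)]

    # All pairs positively correlated: both sums are large and equal.
    alike = {p: 0.5 for p in pairs}
    print(group_tau_sums(members, alike))

    # Half positive, half negative: absolute sum stays large,
    # algebraic sum falls toward zero.
    mixed = {p: (0.5 if i % 2 == 0 else -0.5) for i, p in enumerate(pairs)}
    print(group_tau_sums(members, mixed))

    # All pairs near zero: both sums are near zero.
    unrelated = {p: 0.0 for p in pairs}
    print(group_tau_sums(members, unrelated))
```

The three hypothetical cases correspond in pattern to the Type 1, Type 2, and Type 3 descriptions: the two sums together separate groups that a single sum would confuse.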


Dependent Variables 


The solutions to two problems and measures of
satisfaction with these solutions served as the
dependent measures. The first of these problems, one
involving quality alone, was the Mined Road Problem
(MRP).⁴ It was administered in both semesters in the
next to the last meeting of the semester.

3 The author thanks his wife, Roslyn B. Hoffman,
for programming the computation of tau for the
MIDAC and John W. Carr, III and Cecil Craig
representing the Rackham School of Graduate Studies for
allowing me to use the MIDAC and for providing funds
to pay for the computer time required. Without this
assistance the magnitude of the computations involved
would have made this study impossible to do within
the limited time available.

On this problem, the group is told that it is a five-
man guerilla team which has just blown up an enemy
bridge and is attempting to return to its own lines. 
In order to meet a truck that is to carry the group 
back to the base, the men must cross a road that is 
known to be heavily mined. Scattered around the 
area are some scrap materials, e.g., ropes, lumber, 
etc. that could be used to cross the road. The problem 
is to determine the best method for crossing the road 
safely, quickly, and without traces of crossing. 

Solutions to the problem were assigned a numerical 
score by a content analysis scheme developed by Lorge 
and his associates (4). A solution accumulated points 
according to the relative feasibility of the method used 
to cross the road, the relative safety, the concealment 
of clues, and the time taken to solve the problem. The 
reliability of the scoring system was computed by a 
repeat scoring of a sample of 19 group solutions two 
weeks after the initial scoring. The correlation between 
these two sets of scores was .84, high enough for con- 
fidence in the results obtained from the use of the 
scoring system. 

The second problem was the Change of Work Procedure
(CWP) problem developed by Maier for use in
training supervisors in human relations practices (6, 
p. 54). The problem was considered primarily an 
affective problem with a high “acceptance” component 
(6). That is, although solutions to the problem can be 
classified into different types, there is no objectively 
“best” solution; the best solution is that most accept-
able to the group, the one they would be willing to 
carry out. Because certain solutions can be considered 
to be inventive, however, the problem also has a quality 
component.

The problem is a role-playing situation in which 
three workers, who perform three different jobs in 
hourly rotation with each other, report to a supervisor, 
the fourth man in the group. The supervisor, at the 
suggestion of a time-study man, requests the group to 
work fixed positions instead of rotating their jobs, to 
achieve more efficient production. Solutions to the 
problem resolve the conflict between attempts to 
capitalize on individual ability for higher production 
(supervisor’s suggestion) and the freedom from monotony
resulting from rotation. Three types of solutions
are usually obtained: Old solutions: the group refuses 
to change or continues rotation with a different time 
interval between changes; New solutions: complete 
acceptance of the supervisor’s suggestion or acceptance 
with some minor modifications like rest pauses or 
music; and Compromise or Inventive solutions: work 
procedures, not included in the roles themselves, that 
attempt to gain the advantages of both individual 
ability and freedom from monotony. Assignment of 
the solutions obtained in this study to these categories 
offered no difficulty. A reclassification of the solutions
after a two-week interval showed almost perfect
agreement with the initial classification. The scorings of
both MRP and CWP were done without knowledge of
what type of group had submitted the solution.

4 I wish to thank Irving Lorge of Columbia University
for supplying me with copies of the Mined Road
Problem and the scoring key and for discussing the
interpretation of the scoring procedure with me.

For both problems, measures of the satisfaction 
with the solution reached by the group were collected. 
For the MRP, the question took the form, “Were you 
satisfied with: (a) the entire solution, . . . (d) practically 
none of the solution?” For the CWP, students were 
asked, “How satisfied are you with the solution reached 
by the group? (a) Very satisfied,... (e) Very dis- 
satisfied.” 

The problem-solving activity, the collection of 
solutions, and the expressions of satisfaction with the 
solutions were part of the usual classroom procedure. 
The experimental procedure introduced written solu- 
tions and written expressions of satisfaction instead of 
the usual oral collection of these data. 


RESULTS 
The results of the problem-solving sessions 
will be discussed first in terms of the quality 
and the kinds of solutions obtained in the two 
groups, and second in terms of the acceptance 
of the solutions by the groups. 


Quality and Kind of Solutions 


When the MRP was administered, only 
four Type 2 groups were wholly intact. These 
four were combined with the 13 intact Type 3 
groups to form 17 Nonhomogeneous groups 
for the analysis. To test Hypothesis I, the 
scores for these groups have been compared 
with the scores of the 13 intact Type 1 (Homogeneous)
groups in Table 1.⁵ The higher the
score the better is the solution. The significant 
18.6 point superiority of the mean of the Non- 
homogeneous groups over that of the Homoge- 
neous groups supports Hypothesis I. The 
superiority of the Nonhomogeneous groups is 
even more apparent when the two distribu- 


5 The data for both problems are based on the prod- 
ucts of only those groups which were wholly intact 
(i.e., all four members were present) at the time the 
problem was administered to the class. The use of 
only intact groups for the analysis attempted to avoid 
bias due to differences in group size, and to any
nonrandom forces that might be operating to cause a
person to be missing (e.g., rejection by or of the group) 
and to make the intact parts of these groups different 
from the wholly intact groups. Sixty-five per cent and 
60% of the Homogeneous groups were intact for the 
MRP and the CWP respectively. Forty-six per cent 
of the combined Nonhomogeneous groups were present 
for both problems. This percentage difference is not 
statistically significant. A significantly smaller per- 
centage (27%) of the Type 2 than Type 1 groups was 
intact for the MRP. 
TABLE 1
COMPARISON OF SCORES ON MINED ROAD PROBLEM

Group Type      | Tau  | Mean* | SD    | Number of Groups
Homogeneous     | +    | 44.5  | 28.63 | 13
Nonhomogeneous  | 0, - | 63.1  | 28.53 | 17

* One-tailed t test that the mean of Nonhomogeneous groups
is greater than the mean of the Homogeneous groups is significant
at the .05 level.


TABLE 2
NUMBER OF GROUPS PRODUCING EACH TYPE OF SOLUTION

                                   Type of Solution
Group Type      | Tau  | N  | Old | New | Inventive
Homogeneous     | +    | 12 | 4   | 6   | 2
Nonhomogeneous  | 0, - | 17 | 2   | 8   | 7

tions are compared. When the results of all the 
groups were combined, only 3 (23%) of the 
13 Homogeneous groups exceeded the median 
score for the total distribution, whereas 12 
(71%) of the seventeen Nonhomogeneous 
groups surpassed that score. 
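The one-tailed t test reported with Table 1 can be checked roughly from the published means, standard deviations, and group counts. The sketch below assumes a pooled-variance (equal-variance) t test and that the printed SDs are sample standard deviations with n - 1 in the denominator; the article states neither, so the result is an approximation rather than a reproduction of the author's computation.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Pooled-variance t statistic for mean_b - mean_a, with its df."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_b - mean_a) / std_error, df


if __name__ == "__main__":
    # Summary values from Table 1: Homogeneous, then Nonhomogeneous.
    t, df = pooled_t(44.5, 28.63, 13, 63.1, 28.53, 17)
    print(round(t, 2), df)
    # The tabled one-tailed .05 critical value of t for 28 df is 1.701.
    print(t > 1.701)
```

Under these assumptions the statistic comes out a little above the one-tailed .05 critical value for 28 degrees of freedom, which agrees with the significance level stated in the table footnote.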

Although the groups were assembled purely 
on the basis of the degree of homogeneity of 
the members’ personalities, the question may 
be raised as to whether the obtained differ- 
ences can be safely attributed to the person- 
ality differences alone. Three other variables 
which could have accounted for the differences 
were examined: (a) intelligence and knowl- 
edge of the subject matter of the course, as 
measured by the final course grades, (b) the
sex composition of the group—the number of 
women in the group, and (c) the sociometric 
attractiveness of the group. No significant 
differences were found to distinguish the 
groups, thus reinforcing the causal link be- 
tween the personality composition of the 
groups and their problem-solving qualities. 

The results for the CWP were less clear-cut. 
The distributions of solutions for the Homoge- 
neous and Nonhomogeneous groups are pre- 
sented in Table 2. The chi-square value for the 
contingency table comparing the Homogene- 
ous and Nonhomogeneous groups is 2.88, 
which is not significant at the .05 level of con- 
fidence. Hypothesis II was therefore not con- 
firmed. Both Homogeneous and Nonhomogene- 


ous groups tended to accept the supervisor’s 
suggested work method, the New solution. 
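The contingency analysis of Table 2 can be sketched as a plain Pearson chi-square on the 2 × 3 table of counts. This assumes no continuity correction and no pooling of categories. Computed this way the statistic is about 2.96, slightly different from the 2.88 reported; the article does not give enough detail to say which computational choice accounts for the difference, and the conclusion (not significant at the .05 level) is the same.

```python
def chi_square(table):
    """Pearson chi-square for a rows x columns table of observed counts.

    Expected counts come from the marginal totals:
    expected = row_total * column_total / grand_total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


if __name__ == "__main__":
    # Counts from Table 2; columns are Old, New, Inventive.
    observed = [
        [4, 6, 2],  # Homogeneous
        [2, 8, 7],  # Nonhomogeneous
    ]
    statistic, df = chi_square(observed)
    print(round(statistic, 2), df)
    # The tabled .05 critical value of chi-square for 2 df is 5.991.
    print(statistic > 5.991)
```

The same function applies to the child-by-category tables in the preceding article, where expected values are likewise derived from the marginal distributions.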

Considering Inventive solutions alone, how- 
ever, only 2 (16%) of the 12 Homogeneous 
groups as against 7 (41%) of the 17 Non- 
homogeneous groups produced such solutions. 
Although the percentage difference is not sta- 
tistically significant, it suggests that the reso- 
lution of conflict generated in certain Non- 
homogeneous groups resulted in ideational 
creativity. 


Acceptance of Solutions 


The data reflect a high degree of satisfaction 
with the solutions to the MRP in both types 
of groups, despite the fact that the solutions to 
the problem produced by the Homogeneous 
groups were so inferior by the objective cri- 
terion. In 21 of the 30 groups, all four people 
agreed with the final group solution. The 
tendency toward group acceptance of the 
solution was especially marked in the Homo- 
geneous groups, where the mean number satis- 
fied was 3.8 out of a possible 4.0, and where 
only 2 of the 13 groups were not completely 
in accord on the final solution. In the Non- 
homogeneous groups, the mean was 3.3 and 7 
of the 17 groups were not unanimously satis- 
fied. 

The data support the first part of Hypothe- 
sis III, which predicted no relationship be- 
tween the quality of the solution and the de- 
gree of satisfaction with the solution in the 
Homogeneous groups. Where there is no 
variance there can be no relation. Almost every 
Homogeneous group was satisfied with the 
eminently poor solutions they achieved. 

However, in the Nonhomogeneous groups, 
satisfaction was not related to the quality of 
the solution either. When the distribution of 
scores on the MRP was dichotomized at the 
median, and the Nonhomogeneous groups 
whose members were completely satisfied were 
compared with those where satisfaction was 
less than unanimous, no relationship was 
found between the quality of and satisfaction 
with the solution. The expectation that mem- 
bers of Nonhomogeneous groups would be less 
satisfied with a poor solution than with a good 
one was not confirmed by the data. 

Hypothesis III was not confirmed by the data for the CWP
either. The relationships within each group type, although in
the positive direction, are not statistically different from
each other nor are they significantly different from zero. For
both types of groups, where the group generated the Inventive
solution, the members of the group were, with one exception,
unanimously satisfied with the solution (Table 3). In the one
group which produced an Inventive solution but where
satisfaction was less than unanimous, three of the four members
were satisfied. The relationship for the combined groups is
significant at the .05 level. If a group succeeded, on this
problem of quality and acceptance, in generating a solution
which encompassed both problem aspects of the situation (i.e.,
obtaining greater productivity without monotony and boredom),
the group members accepted the solution as satisfactory.

TABLE 3
RELATION BETWEEN TYPE OF SOLUTION AND SATISFACTION WITH THE SOLUTION
(Change of work procedure)

                              Type of Solution
Group Type                Old or New    Inventive
Homogeneous
  Number satisfied
    Four                      3             2
    Less than four            7             0
                                  p = .15
Nonhomogeneous
  Number satisfied
    Four                      6             6
    Less than four            4             1
                                  p = .24
Total
  Number satisfied
    Four                      9             8
    Less than four           11             1
                                  p = .03
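The probabilities attached to the three 2 x 2 tables in Table 3 can be approximated with an exact test on fixed margins. The source does not name the test, so the sketch below rests on an assumption: a one-tailed Fisher exact test, summing the hypergeometric probabilities of the observed table and of any table with more Inventive solutions in the unanimously satisfied row. The function name `upper_tail_p` is hypothetical. On that assumption the Homogeneous and Total tables give about .15 and .03, as printed; the Nonhomogeneous table gives about .28 rather than .24 (the probability of the observed table alone is about .24), so the printed figure may have been computed differently.

```python
from math import comb

def upper_tail_p(four_other, four_inventive, less_other, less_inventive):
    """One-tailed exact probability for a 2 x 2 table with fixed margins.

    Rows: groups with all four members satisfied / fewer than four.
    Columns: Old or New solution / Inventive solution.
    """
    row_four = four_other + four_inventive
    row_less = less_other + less_inventive
    inventive = four_inventive + less_inventive
    total = row_four + row_less
    favourable = sum(
        comb(row_four, x) * comb(row_less, inventive - x)
        for x in range(four_inventive, min(row_four, inventive) + 1)
    )
    return favourable / comb(total, inventive)

p_homogeneous = upper_tail_p(3, 2, 7, 0)
p_nonhomogeneous = upper_tail_p(6, 6, 4, 1)
p_total = upper_tail_p(9, 8, 11, 1)

for label, p in [("Homogeneous", p_homogeneous),
                 ("Nonhomogeneous", p_nonhomogeneous),
                 ("Total", p_total)]:
    print(f"{label}: p = {p:.3f}")
```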


DISCUSSION 


The results of this study support the value 
of studying the effects of the structure of 
groups without regard to the particular char- 
acteristics of the individuals in the groups. An 
important determinant of the group’s effective- 
ness is the interaction of these individual char- 
acteristics. The degree of homogeneity of per- 
sonality of the members of the groups used in 
this study was seen to have a direct bearing on 
the effectiveness of the groups in producing 
solutions to problems. 

As defined for the present study, homoge- 
neity is a relatively pure concept, reflecting 
the degree of similarity of the personality 
types comprising the group membership, 


rather than the particular personality types of 
the individuals in the groups. The members of 
different Homogeneous groups were not inter- 
changeable, and the personality types that 
were dominant in the Homogeneous groups 
varied from group to group. The members of 
the Nonhomogeneous groups obviously varied 
considerably in the types of personalities they 
represented. The relative “pureness” of the 
concept of homogeneity makes all the more im- 
pressive the finding that the Nonhomogeneous 
groups did significantly better on the MRP 
than did the Homogeneous groups, and that 
there was a tendency for the Nonhomogeneous 
groups to produce more Inventive solutions on 
the CWP. 

The results imply that a multiplicity of perceptions
of a problem is productive of creative
solutions. In the present trend towards
group research in scientific organizations and 
toward group decision-making in various ad- 
ministrative circles, the abilities of each indi- 
vidual may be less important than the peculiar 
composition of backgrounds and experiences 
represented by the various members of the 
team. Pelz, in a study of a scientific research 
organization (7), found that frequent contact 
between a scientist and other members of his 
research group was related to the scientist’s 
own productivity only in those situations 
where the other members of the group were 
dissimilar from him in their work motivations 
and previous work experience. Pelz’s findings 
suggest that the results reported in the present 
study are probably generalizable well beyond 
the limited population of college students who 
supplied the data. 

The findings with respect to satisfaction of 
the group members with the solution to prob- 
lems also bear comment at this point. On the 
MRP there was a high degree of satisfaction 
with the solutions obtained in almost every
group regardless of type. Even in those Non- 
homogeneous groups in which unanimous 
satisfaction was not achieved, only one or two 
members in each group were not completely 
satisfied with the solution. On the CWP, how- 
ever, the only groups which showed a similar 
unanimous satisfaction with the solution were 
those which produced Inventive solutions. 
These two sets of results suggest the possible 
fruitfulness of a distinction between decision- 
making, in which the group members choose 
one of several alternatives, and
problem-solving, in which the members work
through a problem and create a solution. The 
majority of solutions to the CWP were Old or 
New solutions, the results of choosing between 
alternative solutions already given in the 
problem situation. The creation of Inventive 
solutions to the CWP and of all solutions to 
the MRP required the groups to combine 
elements and develop new procedures. For the 
college student group in this study, the creative 
process was a more satisfying experience than 
the decision-making one. 

A final important consideration emphasized 
by the results of this study is the lack of ade- 
quate classification of problem-solving tasks in 
the psychological literature. An attempt was 
made to classify the two problems in this study 
on an a priori basis according to the cognitive 
or quality aspects versus the affective or ac- 
ceptance aspects of the problem. The MRP 
was considered to be purely a quality problem 
requiring the group members to reorganize 
their cognitive fields in a way which would 
provide solutions to the problem. The CWP 
was considered to be a problem involving both 
quality and acceptance. The hypotheses con- 
cerning differences between the performances 
of Homogeneous and Nonhomogeneous groups 
rested on the assumed validity of these classifi- 
cations. The lack of support for the hypothesis 
concerning solutions to the CWP (Hypothesis 
II) may have been a refutation of the theory or
possibly only of the validity of the problem 
description. 

Ray (8) recently performed a service for 
psychologists by bringing together the variety 
of problems frequently used in research on the 
problem-solving process. The lack of any ap- 
parently common elements in these tasks 
points up the problem of predicting group 
performance knowing only the nature of the 
group. The interrelationships among group and 
task characteristics are fundamental, and the 
study of group problem-solving cannot afford 
to ignore them. 


SUMMARY 


The study investigated the relationships be- 
tween the characteristics of groups and of 
problem-solving tasks, and the effects of these 
relationships on the groups’ problem-solving 
performances. The performances of groups 
varying in the homogeneity of their members’
personalities were compared on a problem
involving quality alone, and on another
involving acceptance and quality.

On the basis of correlations between pairs of 
individual score profiles on the Guilford- 
Zimmerman Temperament Survey, groups of 
four students each were established at the 
beginning of two semesters in a course in 
human relations. Homogeneous groups, in 
which the members had high positive profile 
correlations, and Nonhomogeneous groups, in 
which the profile correlations were zero or 
negative, were maintained throughout each of 
the semesters. 

The Nonhomogeneous groups produced sig- 
nificantly superior solutions to the quality 
problem, and showed a tendency to produce 
more inventive solutions to the problem in- 
volving quality and acceptance. No significant 
differences were found in the degree of satis- 
faction with solutions to either of the problems. 
On the problem of quality and acceptance, a
significant correlation was found, for the two types
of groups combined, between the members’ satisfaction
with the solution to the problem and the
quality of the solution.


THE concept that unacceptable aggressive
or hostile impulses may be
“displaced” to targets more suitable
than the original one has been with us in 
psychology at least since the writings of 
Sigmund Freud (8). However, it is primarily 
as a result of the explicit formulation of 
frustration-aggression theory (7) that con- 
certed experimental test of this proposition 
has been attempted in diverse areas. One 
specific formulation derived from these
conceptualizations is that increasing personal
frustration may have, as one consequence, 
an increase in expression of prejudice. Such a 
theoretical notion has been referred to as a 
“scapegoat” theory of prejudice (23). A more 
detailed consideration of possible relations 
between frustration of personal needs and 
prejudice has been presented by Krech and 
Crutchfield (13) under the heading of “a 
motivational analysis of prejudice.” 

Criticism has been directed to a scapegoat 
theory of prejudice both on theoretical and 
empirical grounds. In the former instance, 
the argument has been advanced that a scape- 
goat theory is an insufficient basis for ex- 
plaining a sizeable number of instances of 
prejudice (23). As stated, there can be little 
question as to the justifiability of this argu- 
ment. On the other hand, a scapegoat theory 
of prejudice may quite appropriately be 
viewed as no more than one of a series of 
explanatory principles required for complete 
understanding of the phenomena of prejudice. 
Gordon Allport, in his scholarly treatment of
this problem (2), takes exactly such a position.
Allport reviews six major classes of theoretical
explanations of prejudice and points out that
each seems to constitute a constructive vehicle
for augmentation of our understanding of the
phenomenon. Allport states “...as a rule
most ‘theories’ are advanced by their authors
to call attention to some one important
causal factor, without implying that no other
causal factors are operating” (2, p. 207).
It may therefore be important to re-emphasize
that when we are dealing with complex social
processes such as, for example, prejudice,
delinquency, industrial conflict, and
international tensions, multiple determinants are
likely to be involved. The identification of a
single determinant does not in any way
positively demonstrate that this is a sole
determinant; nor does it necessarily preclude
the operation of differing determinants toward
the same end result.

1 Portions of the present paper were presented to
Division 8 at the annual meetings of the American
Psychological Association in New York, September,
1957.

2 The authors wish to express their appreciation to
Russel F. Green for his contributions to the final
method of analysis.

3 The study was carried out while both junior authors
were at the University of Rochester.

An examination of some empirical data 
bearing on a scapegoat type theory indicates 
fairly conclusively that such an explanation 
should indeed be considered partial. For 
example, Morse and Allport (17), in a com- 
prehensive investigation of seven hypotheses 
about the causes of anti-Semitism, found that 
only the factor of “national involvement” 
co-varied uniquely with anti-Semitism. “Circumstance
frustration,” the factor most
directly derivable from a scapegoat theory, 
related only modestly to discriminatory 
treatment of Jews, leading the authors to 
conclude that scapegoat theories may not be 
taken as “general explanations of anti-Semi- 
tism.” 

Lindzey (15), in partial support of a scape- 
goat explanation, reported that both high 
and low prejudice Ss increased significantly 
in displaced aggression following frustration. 
On the other hand, since the high prejudice
Ss (contrary to deductions) failed to displace
more aggression than the lows, the author
rejects scapegoating as a comprehensive
explanatory principle.

Studies offering less qualified support for
the existence of the scapegoating phenomenon
are also reported in the literature. Thus
Allport and Kramer (3), in their classic
investigation of the “roots of prejudice,” observe
that among their Ss (Harvard, Dartmouth,
and Radcliffe undergraduates) Catholic
and Jewish Ss who saw themselves as more
victimized also tended to be more prejudiced
toward other minority groups. These findings,
interpreted within a frustration aggression
framework, were subsequently replicated by
Rosenblith (20) with South Dakota undergraduates.
Gough (9) found that high anti-Semite
Ss are “less able to overlook and ignore
minor irritations and frustrations.” Mussen
(18) reports that high prejudice children had
stronger aggressive and dominant needs
than did low prejudice Ss, and that they also
showed an increase in prejudiced feelings
toward Negroes, in contrast to lows who
showed a decrease, following four weeks in an
interracial summer camp. Finally, Bettelheim
and Janowitz (4) have demonstrated significant
contingencies between downward social
mobility of veterans and intensity of anti-Semitic
and anti-Negro attitudes.

With regard to the empirical data thus far
reviewed, we may tentatively conclude that a
scapegoat concept provides a basis for understanding
some instances of prejudice, but is
insufficient as a general explanatory principle.
Two additional investigations (5, 15), each
of which constitutes a direct test of the scapegoat
proposition, remain to be considered.
Miller and Bugelski (16), working in the
context of a CCC camp, were able to show a
significant drop in positive attitudes and
some trend toward increasing negative attitudes
toward Mexicans and Japanese following
a rather realistic experimental induction of
frustration. On the basis of these findings the
authors concluded that frustration increased
aggression, which was in turn displaced in the
form of deterioration of attitude to minority
group members.

Recently, however, on the basis of some
experimental work by Congdon (5), as reported
by Stagner and Congdon (22), some
question has been raised with respect to the
generality, if not substance, of the Miller and
Bugelski findings (15). Congdon assessed
attitudes toward various in-groups and out-groups
using a series of modified Osgood
semantic-differential-type scales. Following
this, experimental Ss in two groups were
either mildly or strongly frustrated by failure
on two of four or four of four subtests of the
Grace-Arthur, respectively. A control group
received no frustration. Subsequent readministration
of the attitude scales indicated
no differences in attitude change scores among
the three groups. On the basis of these data,
Congdon challenges the defensibility of a
scapegoat theory of prejudice. He goes on to
speculate that the failure to support the Miller
and Bugelski findings may reflect some combination
of: (a) having used a less arbitrary
type of frustration (e.g., see Pastore (19)),
(b) having provided outlets for self-punitive
behavior which were presumed not to have
been present in the Miller and Bugelski
experiment, and (c) the higher intellectual
level of his subjects.

If on theoretical grounds one espouses, as 
we have, the view that a scapegoat theory of 
prejudice may be most useful as one of a series 
of complementary explanatory principles 
underlying the complex social phenomenon 
of prejudice, the Congdon findings raise the 
question as to whether such a view is useful 
even as a single particularist explanation for 
understanding some manifestations of preju- 
dice. It is to this latter specific issue that the 
present research is addressed. In essence, we 
have attempted to re-examine the proposition 
that frustration will lead to an increased 
verbal expression of prejudice, preserving in 
our design the features of nonarbitrariness, 
opportunity for expressive and self punitive 
behavior, and high intelligence level of Ss, 
to which Congdon has attributed his negative 
findings. 


METHOD 


Instruments 


Two comparable subscales, each presumably measuring
authoritarian attitudes and minority group
prejudice, were drawn from a larger pool of items utilized
in the California studies (1). These included the
30-item F scale (combined Forms 40 and 45), the 12-item
Anti-Negro (AN) scale, and eight items each from
the Anti-Minority (AM)⁴ and Patriotism (P) scales of
the larger Ethnocentrism (E) scale. Items were assigned
so that the subtests would be equated with respect
to item discrimination quotients as reported in
The Authoritarian Personality (1). The final forms (X
and Q) each contained 29 items as follows: (a) F scale,
15 items; (b) AN scale, 6 items; (c) AM scale, 4 items;
(d) P scale, 4 items.


Subjects and Procedure 


Subjects (Ss) were 32 male and 32 female introductory
psychology volunteers, all of whom were
tested individually. The actual experiment consisted of
three phases: the attitude pretest, frustration, and the
attitude posttest. In the pretest phase, Ss were given
either Form X or Form Q of the attitude scale, with
order of presentation counterbalanced for both sexes.
Immediately following completion of this first attitude
scale, Ss were informed that we wished to collect some
additional and separate data bearing on the problem
solving habits of college students. It was in this context
that frustration was introduced.
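The counterbalancing of form order within each sex can be pictured as a simple alternating assignment. The sketch below is an illustration only: the subject identifiers, the strict alternation, and the resulting equal split of 16 Ss per sex-by-order cell are assumptions, since the text states only that order of presentation was counterbalanced for both sexes. The function name `counterbalance` is hypothetical.

```python
from collections import Counter
from itertools import cycle

# (pretest form, posttest form); each S receives the alternate form at posttest
FORM_ORDERS = (("X", "Q"), ("Q", "X"))

def counterbalance(ids_by_sex):
    """Alternate the two form orders within each sex."""
    schedule = []
    for sex, ids in ids_by_sex.items():
        for subject_id, (pre, post) in zip(ids, cycle(FORM_ORDERS)):
            schedule.append((subject_id, sex, pre, post))
    return schedule

# Hypothetical identifiers for 32 male and 32 female Ss
ids_by_sex = {
    "M": [f"M{i:02d}" for i in range(1, 33)],
    "F": [f"F{i:02d}" for i in range(1, 33)],
}

schedule = counterbalance(ids_by_sex)
cell_counts = Counter((sex, pre) for _, sex, pre, _ in schedule)
print(dict(cell_counts))
```

An equal split of this kind gives four sex-by-order cells, which is consistent with the 60 degrees of freedom for between-subjects error in a sample of 64.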

In order to induce frustration, all Ss were given two
puzzles which, though appearing soluble, were functionally
nonsoluble in the time allotted. The actual
puzzles used were the nine-dot problem reported elsewhere
by Cowen (6), and one of the Katona matchstick
problems (12). Fictitious time norms were given,
so as to increase the likelihood that frustration would
occur. The actual time allotted by E fell far short of
what would be needed by most people to solve the
problems.⁵ The attitude of the E during administration
of the frustrating puzzles might best be described as
aloof, nonsupporting, and disbelieving of S’s inability
to achieve a correct solution.⁶ Upon the S’s failure to
solve the second puzzle E stated simply, “I’m afraid
our time is up for this problem too. We will have to
complete the second part of the attitude scale now.”
The S then completed the alternate form of the combined
attitude scale.


RESULTS 
Table 1 summarizes mean “pre” and “post”
frustration test scores for each of the attitude 
test subscales. The data are presented sepa- 
rately by sex and for the two orders of ad- 
ministration. 


4 Several items which were considered inappropriate
either because of temporal or geographic factors (e.g.,
“zoot-suiters” and “Filipinos”) were either deleted or
slightly modified in wording.

5 Five Ss who got correct solutions to one of the
puzzles were dropped from the study and replaced by
new Ss.

6 It was the impression of the Es that failure to solve
the puzzle did constitute a frustrating experience for
the great majority of Ss in terms of such behavioral
manifestations as increased fidgetiness or blushing,
and verbal comments of discomfort, self-depreciation
and/or hostility to the Es.


TABLE 1
MEAN “PRE” AND “POST” SCORES FOR ALL SUBSCALES

                            Order
Scale    Sex    X pre    Q post    Q pre    X post
AN       M      11.2     15.6      10.4     10.6
         F       9.9     11.3      11.3     10.8
F        M      49.7     44.9      44.0     55.2
         F      47.4     42.6      40.9     46.3
AM       M       8.6      9.4      10.3      9.4
         F       7.8      8.8       7.9      8.1
P        M      13.6     11.7      13.4     13.5
         F      12.4     11.4      12.1     12.9


TABLE 2
ANALYSIS OF VARIANCE FOR ANTI-NEGRO SCALE

Source           df     Sum of Squares    Mean Square    F        p
Between Ss       63     1942.5
  B               1       39.1            39.1            1.33    ns
  A X C           1       48.5            48.5            1.64    ns
  A X B X C       1       86.4            86.4            2.93    ns
  Error (b)      60     1768.5            29.5
Within Ss        64      345.5
  A               1       73.3            73.3           25.28    .001
  C               1       58.0            58.0           20.00    .001
  A X B           1        9.8             9.8            3.38    ns
  B X C           1       30.2            30.2           10.41    .01
  Error (w)      60      174.2             2.9
Total           127     2228.0

Note. A = Test form; B = Sex; and C = Pre-Post frustration.
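Each F ratio in Table 2 is the effect's mean square divided by the error mean square of its own stratum, between-subjects effects against Error (b) and within-subjects effects against Error (w). The sketch below recomputes the ratios from the printed mean squares. The dictionary layout is illustrative, and the effect labels follow the table note (B for the between-subjects effect of sex, C for the pre-post effect); the .05 critical value is the standard tabled F for 1 and 60 degrees of freedom.

```python
MS_ERROR_BETWEEN = 29.5   # Error (b), df = 60
MS_ERROR_WITHIN = 2.9     # Error (w), df = 60

# Mean squares from Table 2, each effect with 1 df
between_effects = {"B": 39.1, "A X C": 48.5, "A X B X C": 86.4}
within_effects = {"A": 73.3, "C": 58.0, "A X B": 9.8, "B X C": 30.2}

f_values = {}
for name, mean_square in between_effects.items():
    f_values[name] = round(mean_square / MS_ERROR_BETWEEN, 2)
for name, mean_square in within_effects.items():
    f_values[name] = round(mean_square / MS_ERROR_WITHIN, 2)

# Tabled critical value of F at the .05 level for df = (1, 60)
F_CRIT_05 = 4.00
significant = sorted(name for name, f in f_values.items() if f > F_CRIT_05)

for name, f in f_values.items():
    print(f"{name:<10} F = {f:.2f}")
print("Significant at .05:", significant)
```

The recomputed ratios agree with the printed F column, and only the effects of A, C, and B X C exceed the .05 critical value.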


For each of the constituent subscales, a 
three-way analysis of variance was carried 
out, involving the main effects of test form 
(A), sex (B), and pre-post frustration (C). 
Since both test form and pre-post frustration
(the effect in which our major interest centered)
are “within subjects” effects, a Lindquist
Type IV design (14) was employed as the 
basic model for the analysis. In general, the 
only subscale that presented consistently 
positive findings was the AN scale. 


AN Scale 


Table 2 presents the results of a three-way 
analysis of variance for the Anti-Negro scale. 
The most salient findings in this table are the 
significant F ratios involving pre-post frustration
(C). Although the main effect here is
highly significant, suggesting, in support of
our hypothesis, the presence of stronger anti- 
Negro feelings following frustration, this 
finding may be pinpointed somewhat more 
specifically by noting other significant main 
effects and interactions, together with the 
means presented in Table 1. Thus it appears 
that there may be differences in the two 
supposedly equated subtests, with higher AN 
scores being given on Form Q. Perhaps more 
germane is the significant B X C interaction, 
indicating that male Ss express significantly 
stronger anti-Negro attitudes after frustration 
than do female Ss. 


Incidental Findings 

For the P scale, no significant main effects 
or interactions are observed. The pattern of 
findings for the AM and F scales is highly 
similar. In each case there are significant 
differences in the two test forms, a finding 
entirely tangential to our present focus, and 
on both scales there is either a significant 
(AM) or near significant (F) main effect of 
sex. Male Ss tend to score consistently higher 
(more negative attitudes) on both of these 
scales. This difference, however, is a general 
one, which does not vary systematically for 
“pre” vs. “post” frustration. 

In order to test the generality of responses 
to the various subtests, a series of 12 Pearson 
product-moment correlations, in which AN 
scores were related to each of the other three 
scale scores (by sex, for both the “pre” and 
“post” tests), were computed. The correlations
ranged in magnitude from .27 to .58
and averaged .40. These correlations are 
substantially lower than the ones reported 
by the California group (1). 


DISCUSSION 


The most notable finding in the present
experiment is the significant increase in
anti-Negro feelings following experimental
induction of frustration. Such a datum offers
additional support for the existence of the
scapegoat phenomenon and is quite consistent
with earlier findings of Miller and Bugelski
(15). Of incidental interest is the consistent
trend observed on three of the subscales for
male Ss to show greater prejudice (as well as
greater increase in prejudice following frustration
on the AN scale) than do females. This
finding too is in line with empirical evidence
(3, 20) and theoretical expectations (21)
discussed elsewhere.

There remains a sharp contrast between our 
basic findings in support of a scapegoat theory 
and those of Congdon (5) which fail to support 
this view. The latter has proposed that his 
failure to obtain significant postfrustration 
effects may reflect some combination of less 
severe and less arbitrary frustration, provision 
of opportunities for self-punitive behavior, 
and the higher intellectual level of his Ss. In 
the present study, frustration was neither 
severe nor arbitrary or, at least insofar as can 
be judged, no more so than Congdon’s. Oppor- 
tunities for self-punitive behaviors should have 
been roughly comparable, as was the intellectual
level of the Ss. These factors notwithstanding,
it was possible to demonstrate the
operation of the scapegoat effect in the present 
study. 

The source of the discrepant findings in these 
two ostensibly comparable studies cannot be 
identified with any confidence, but several 
procedural variations may be noted that might 
have obscured the scapegoat effect in the 
Congdon experiment. The attitude dimensions 
used for the ratings tended to have quite highly
crystallized social desirability values (e.g.,
kind-cruel, strong-weak, etc.). Both pre- and
postscores may thus have been pushed toward 
the socially desirable response, obscuring dif- 
ferences. Then, too, the use of identical items 
in both pre- and posttest (in contrast to the 
alternate forms of the present study) may have 
operated to help sophisticated Ss sense the 
purpose of the experiment. Finally, Congdon 
used speed instructions, which may possibly 
have impaired the reliability of the attitude 
scales. 

Possibly the most interesting issue raised by 
our findings is the fact that significantly higher 
anti-Negro attitudes are present following 
frustration, in the absence of a parallel increase 
in F, AM, and P scale scores. In the basic 
development of these scales (1), substantially 
high intercorrelations have been reported. 
Whether such correlations reflect a true clustering
of these classes of attitudes, or a pervasiveness
of response tendency behavior, the
fact remains that there was reason to anticipate
that they would have “behaved” similarly in
our study. However, they did not. Our own
pre- and postscale intercorrelations run substantially
lower than those originally reported,
suggesting that our Ss responded with a
degree of independence to the subscales. The
positive results on the AN scale can be seen
most defensibly as an illustration of
“targeting” a specific minority group (2, 10,
23). That the Negro is the targeted group in 
the present study may be a manifestation of a 
tendency noted earlier by Horowitz (11) for 
this group to constitute a preferred target in 
this geographic locale. In another vein, 
Bettelheim and Janowitz (3) observe that
thresholds for anti-Negro prejudice may be 
lower than those for other minority groups to 
the point where negative attitudes may break 
through despite the presence of “relatively 
adequate controls.” In agreement with such an 
interpretation is our observation that anti- 
Negro feelings seem to constitute a preferred 
prejudice in informal conversations of under- 
graduates at this institution. 

In general overview, the present findings are 
viewed as confirming the hypothesis that frus- 
tration augments the expression of prejudice. 
The major limitation placed upon this conclu- 
sion is that the consequent increase in preju- 
dice may be specific rather than generalized. 
Our findings in no way limit the role or impor- 
tance of other types of antecedents of preju- 
dice. Undoubtedly the relationships between 
many such antecedents and the same final 
product will have to be identified if we are 
ultimately to have an adequate, comprehen- 
sive theory of prejudice. For the present, how- 
ever, the usefulness of a scapegoat theory, at 
least as one of a series of complementary ex- 
planatory principles, appears defensible. 


SUMMARY 


The present study was designed to test the 
proposition that frustration may increase the 
expression of prejudice. Sixty-four Ss were 
given a series of attitude scales, following which 
all were exposed to a relatively mild, experimentally
induced frustration. Immediately
thereafter, alternate forms of the attitude scale
were administered.

Significant increases were found in the ex- 
pression of anti-Negro prejudice following 
frustration, this effect being more pronounced 
in male Ss. Since comparable postfrustration 
effects were not observed on other subscales, 
the results were interpreted as an instance of 
“targeting” of a minority group within the 
general framework of the scapegoat phenomenon.


ACQUIRED AND SYMBOLIC AFFECTIVE VALUE AS DETERMINANTS
OF SIZE ESTIMATION IN SCHIZOPHRENIC AND
NORMAL SUBJECTS¹


THEODORE P. ZAHN²


Duke University


RECENT work in psychopathology has
shown that differences between schizo- 
phrenics and normals, with respect to 

some kinds of performance at least, may be 
apparent primarily when stressful or threaten- 
ing elements are present in the situation (14). 
Criticism of subject’s (S’s) performance (3, 
12) and the use of stimulus materials depicting 
a censorious relationship between a mother 
and a son (9) have both been found to affect 
differentially the performance of schizophrenic 
and normal Ss in experimental tasks. 

An important variable related to reactions 
to stress and threat in schizophrenics has been 
the level of social and sexual adjustment before 
the illness as measured by the Phillips Scale 
of Premorbid Adjustment in Schizophrenia 
(16). This scale was designed to measure the 
extent to which the patient had participated 
in interpersonal relationships before his ill- 
ness. Patients with very inadequate premorbid 
adjustments (Poors) have been found to have 
less favorable prognoses than those with more 
adequate premorbid histories (Goods) (10, 16). 
These two kinds of schizophrenic patients 
have been found also to exhibit different reac- 
tions to stress in experimental situations. The 
Poors exhibited poorer performance in a dis- 
crimination task than did Goods or Normals 
only when pictorial representations of censori- 
ous mother-son relationships were involved (9). 


1 This report is based on a doctoral dissertation submitted
to the Graduate School of Arts and Sciences of
Duke University. The research was supported by
USPHS-NIMH Project M-629. The author wishes to
express his gratitude to Eliot H. Rodnick for advice
and encouragement throughout all phases of the study
and in the writing of this report. He is also indebted to
the departments of psychology and psychiatry at
the Roanoke VA Hospital, the Durham VA Hospital,
and the State Hospital at Butner, N. C., for their
cooperation in making subjects and facilities available.
A paper based on the findings in the present report was
read at the APA Convention in 1956, at Chicago,
Illinois.

2 Now at the Laboratory of Psychology, National
Institute of Mental Health, Bethesda 14, Md.


A study by Bleke (3) also demonstrated that 
censure or threat is more effective in modify- 
ing the behavior of Poors than Goods, but 
that the modification may not necessarily be 
detrimental. These modifications seem to be 
in the form of behavior directed toward an 
avoidance of the censure or threat which may 
or may not interfere with effective task per- 
formance, depending on the situation (3, 4, 
17). 

Harris (13) had Good and Poor premorbid 
schizophrenics and normal Ss judge the sizes 
of a square, a “neutral” picture, and four 
pictures of a mother and son with various 
emotionally relevant contents. On the latter 
five items, the Poors overestimated and the 
Goods underestimated the sizes of these 
pictures, with the Normals falling in between. 
On the basis of Klein’s (15) notion that Ss 
who manifest flexible or weak cognitive con- 
trols over their needs tend to overestimate the 
sizes of stimuli as compared to Ss whose con- 
trol mechanisms are stronger or “constrictive,” 
these data suggest that the two premorbid 
groups may differ in the degree of such con- 
trol. Klein’s data also show that the presence 
of a strong need will exaggerate the dominant 
size estimation tendency, although the evi- 
dence for this in the Klein study is convincing 
only for the flexible control Ss. Other studies 
on size estimation indicate that overestimation 
may be a function of either positive or nega- 
tive value or “personal relevance” (5, 6, 8). 
Under this perceptual accentuation hypothesis, 
Harris’ data may be explained by assuming 
that all the pictures he used had more personal 
relevance for the Poors than for the other two 
groups. 

According to both the Klein hypothesis and 
the perceptual accentuation hypothesis, then, 
one would expect changes in size estimation as 
a function of the affective value of the stimu- 
lus being estimated. These theoretical and 
empirical considerations indicate that the size 
estimation technique may have promise as a 
means of investigating the kinds of variables
which have special affective significance for 
schizophrenic patients and for elucidating the 
mechanisms underlying their behavior in 
situations possessing such significance. 

It was thought that affective value might be 
“built into” neutral pictures by a pseudo- 
conditioning technique involving differential 
reinforcement. This was attempted by prior 
presentation of the two to-be-judged pictures 
in a discrimination task not involving differential
size. S’s responses to one of the pictures
were almost consistently rewarded (he was
informed that he was “right”). His responses
to the other picture were just as consistently
punished (informed “wrong”). Size judgments
of each picture were then made by the method 
of average error. 

It may be assumed that through the con- 
ditioning procedure, the punished picture 
acquires secondary reinforcing properties and 
produces to some extent the same effects as 
those elicited by “wrong.” This may be func- 
tionally equivalent to an increase in a drive. 
Thus, from the Klein hypothesis, one would 
expect an increased overestimation of the
censured picture by the Poors and an in- 
creased underestimation of it by the Goods. 
The perceptual accentuation hypothesis (in 
conjunction with the Poors’ demonstrated 
greater sensitivity to censure) would lead 
similarly to the prediction of an increased over- 
estimation by the Poors but would predict 
smaller changes in the same direction by Goods 
and normal Ss. A preliminary study involving 
ten Poors and six Goods, in addition to con- 
firming Harris’ finding of a relative overestima- 
tion tendency by the Poors, showed that this 
group overestimated the size of the picture 
associated with punishment relative to the 
rewarded picture, and that the Goods showed 
a small tendency in the opposite direction. 
The interaction between premorbidity and 
type of reinforcement was statistically sig- 
nificant (p < .05). 

The present experiment was designed partly 
to examine this phenomenon more definitely. 
Half of each subject group judged the sizes of 
two neutral pictures after the latter had af- 
fective value built into them by the discrimi- 
nation procedure. The remaining Ss judged 
these same pictures without their having been 
so treated. This latter procedure provides a test
of the hypothesis that the subject groups differ
in their general size estimation tendencies to 
neutral stimuli. 

Similarly, the sizes of two mother-son pic- 
tures, one a scolding scene and one a feeding 
scene, were judged both “treated” and “‘un- 
treated.” The hypothesis was entertained that 
the symbolic representation of a censorious 
mother-son relationship produces effects simi- 
lar to those produced by the experimental 
administration of censure. Under the assump- 
tion that the Poors have been subjected most 
strongly to maternal domination (17), it was 
predicted that they overestimate the size of 
the scolding picture relative to that of the 
feeding picture to a greater extent than the 
other two groups. 

In addition, the experiment may shed some 
light on the more general problem of the 
motivational conditions which influence size 
estimation. The present procedure of adjusting 
the size of the actual stimulus to a standard 
from immediate memory is not strictly com- 
parable to the previous studies which used 
discs and had the standard present at all times, 
but it is probable that similar variables oper- 
ate in both cases. 


METHOD 
Subjects 


All Ss were white, male hospitalized veterans, less 
than 45 years of age, without gross or uncorrected 
sensory defects or organic brain damage so far as could 
be determined, and of normal intelligence. Twenty-four 
of the Ss were diagnosed as schizophrenic. Of these 
patients, 13 were rated as Goods and 11 as Poors on the 
basis of the Phillips Scale of Premorbid Adjustment in 
Schizophrenia. The range of possible scores on this 
scale is 0-30; a high numerical value indicates a poor 
premorbid adjustment and a low score a good premorbid 
adjustment. In the present study, the Poors had scores 
of 18 and above, the Goods 13 and below. Only those 
patients were used whose file contained sufficient infor- 
mation to permit a reliable rating. No patient was used 
who had had a lobotomy or who had undergone shock 
therapy for at least one month prior to testing. Most 
patients had had shock some time in the past. Although 
16 patients taking tranquilizing drugs were tested, no 
data from such Ss are included in the present report. 
Fifteen schizophrenics, 10 Poors and 5 Goods, were ex- 
cluded from the sample because of inability to perform 
the task adequately.

Twenty normal Ss were drawn from the medical and 
surgical wards of a VA general hospital. Case records 
were scrutinized and ward doctors and nurses consulted 
to choose patients who were free from major psychiatric 
symptoms. Patients whose diagnoses included dis- 
orders usually considered to be psychosomatic were 
not used. No normals were eliminated because of failure
to do the tasks adequately. 


The Size Estimation Task 


The apparatus for the size estimation task has been 
described in detail elsewhere (13). The stimuli were 
negatives (white lines on black background) projected 
by a 10-watt point source of light on a translucent 
screen large enough (22” x 32”) to minimize framework 
cues. The distance of the stimulus slides from the light 
source (and hence the size of the image) was adjustable 
by either of two knobs. One knob was at the back of 
the apparatus and was for the use of E; the other protruded
just below the screen and was operated by S,
who sat in front of the screen. A plywood shield extended
across the base of the screen and over S’s knob to
shield it from his view. The use of a point source of light 
insured a focused image of constant brightness on the 
screen at all sizes. The image on the screen could be 
turned on and off by means of a shutter. 

Changes in the position of the carriage in time were 
recorded by means of a pen in a holder hinged to a brass 
rod extending from the carriage to a gridded chart which 
was driven by a continuous feed kymograph. 

The Ss estimated the sizes of the pictures from im- 
mediate memory by the method of average error. On 
each trial, E said, ““Here’s the regular size,”’ and flashed 
the picture on for about five seconds by count. Then he 
shut it off, changed the position of the carriage, and 
re-exposed the picture at a different size saying, ‘““Now 
change it,” as he did so. Approximately four seconds 
elapsed between exposures. When S had adjusted the 
picture to suit himself, he indicated so verbally, and
the picture was shut off.

Each picture was judged eight times by this method. 
The size at which the picture was re-exposed (offset) 
varied from trial to trial in a predetermined order,
constant for all Ss on all pictures. There were four
different offsets, two in each direction. For each of the
first four trials and for each of the last four trials for
any one picture, a different offset was used.
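
The offset scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: the offset magnitudes, the seed, and the resulting order are hypothetical, since the report says only that there were four offsets, two in each direction, that each was used once in the first four and once in the last four trials, and that the order was the same for all Ss on all pictures.

```python
import random

# Hypothetical offset values (arbitrary units). The report gives only their
# number (four) and that two lie in each direction from the standard.
OFFSETS = (-2.0, -1.0, 1.0, 2.0)


def make_offset_order(seed=0):
    """Return an 8-trial offset order with each offset once per block of four."""
    rng = random.Random(seed)  # fixed seed: same order for every S and picture
    order = []
    for _ in range(2):  # first four trials, then last four trials
        block = list(OFFSETS)
        rng.shuffle(block)
        order.extend(block)
    return order


order = make_offset_order()
```

Fixing the seed stands in for the "predetermined order"; the actual sequence used in the study is not given in the report.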


The Discrimination Task 

In this task, each S was required to discriminate 
standard from variations for two pictures. The apparatus
included two Kodaslide projectors, one of which
had an Alphax shutter mounted on it, and a beaded 
screen. On any given trial, E exposed the standard on 
the screen for two seconds; this was followed by a two 
second wait, and then the variable picture was pre- 
sented for 4¢ second. S indicated whether he thought 
the two pictures the same or different by pushing or 
pulling a Mallory switch which was appropriately 
labeled. E reinforced S by lighting up one of the two 
boxes, one labeled “right” and the other “wrong,” 
placed on the floor just below the screen. The front side 
of each box consisted of a sheet of translucent plastic 
on the back of which was printed either “right” or 
“wrong” in black letters 1-3/4" high. A 15-watt incandescent
bulb was in each box. The words were faintly
visible when the boxes were unlighted and stood out 
clearly when they were lighted. 

Each S was given 30 trials on this task, 15 on each
of two pictures. Trials on the two pictures were mixed
in a predetermined irregular order with not more than 
two consecutive trials on the same picture. For one of 
the pictures a discrimination was possible, the variable 
picture being different from the standard on two-thirds 
of the trials, and S was rewarded (flashed “right”)
on 13 of the 15 trials, and punished on only two trials 
regardless of his responses. It was hoped that this 
discrimination would be fairly easy and that most Ss 
would really master it. For the other picture, no discrimi- 
nation was possible, the standard and variable stimuli 
being the same on all trials, and S was told “wrong” 
on 13 of the 15 trials and “right” on two trials regardless 
of his responses. These two pictures are referred to as 
the “rewarded” and “punished” pictures, respectively. 
There were five stimulus pictures: a square and four 
pictures with meaningful content. Two of the content 
pictures contained objects: one, a house and a tree 
(H-T), the other a lamp and a table (L-T). The other 
two pictures depicted a woman and a small boy. In the 
“scolding” (Sc) picture, the woman had her arm outstretched,
and the boy stood with his head down.³
In the “feeding” (Fe) picture, the woman held a pitcher 
outstretched and the boy a glass in his outstretched 
hand. The variations of these pictures used in the 
discrimination task were the same as the standard 
except in the positions of one element. In the house-tree 
picture, the angle of a limb varied; in the lamp-table 
picture, one leg was in different positions; on the 
mother-son pictures, the angle of the mother’s arm 
varied. The pictures were designed to be structurally 
similar to a degree, all containing one tall and one short 
figure, and the two mother-son pictures were the most 
similar. The heights of the taller figures were equal. 
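
The trial structure of the discrimination task can be illustrated with a small schedule generator. It is a sketch under assumptions, not the schedule actually used: the report states the constraints (30 trials, 15 per picture, no more than two consecutive trials on one picture, 13 of 15 "right" signals on the rewarded picture and 13 of 15 "wrong" signals on the punished picture, regardless of S's responses) but not the particular order, so the random draw, the seed, and the function names here are hypothetical.

```python
import random


def max_run(seq):
    """Length of the longest run of identical consecutive items."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def make_schedule(seed=0):
    """Return 30 (picture, feedback) pairs for the discrimination task."""
    rng = random.Random(seed)
    # Picture order: 15 trials each, at most two in a row on the same picture.
    pictures = ["rewarded"] * 15 + ["punished"] * 15
    while True:  # reshuffle until the run-length constraint is met
        rng.shuffle(pictures)
        if max_run(pictures) <= 2:
            break
    # Feedback is fixed in advance and does not depend on S's responses.
    feedback = {
        "rewarded": ["right"] * 13 + ["wrong"] * 2,
        "punished": ["wrong"] * 13 + ["right"] * 2,
    }
    for signals in feedback.values():
        rng.shuffle(signals)
    return [(pic, feedback[pic].pop()) for pic in pictures]


schedule = make_schedule()
```

Rejection sampling is used for the picture order only because it is the simplest way to honor the two-in-a-row limit; any fixed irregular order meeting the same constraints would serve equally well.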


Procedure 


When S was seated before the size estimation apparatus
in the experimental room, he was told that E was
interested in how well people could judge the sizes of
different pictures. The instructions described the sequence
of events involved in each judgment, and S
was urged to make the picture as close to the regular
size as he could and told that he could take his time in
doing so. E then darkened the room and gave S at least
two practice trials with the square, reiterating parts of
the instructions during this process. If S’s two practice
adjustments met a predetermined criterion of accuracy,
a regular series of trials was begun. If S’s two practice
adjustments were inaccurate, more practice was given
until he improved sufficiently or until it became clear
that his performance would probably never be valid.
Each S then judged the sizes of the square and one 
content picture, then was given the discrimination task, 
and finally judged the two pictures seen in the discrimi- 
nation task and one other. 

The following descriptions were given of the pictures 
just before S was presented with them for the first time, 
whether in the size estimation or discrimination tasks: 
H-T: “Here is a picture of a house and a tree.” L-T: 
“Here is a picture of a lamp and a table.” Sc: “Here is a 


³ The H-T and Sc pictures are based on those used by
Dunn (9). 
picture of a mother scolding her little boy. He has done
something bad and she is bawling him out for it.”
Fe: “Here is a picture of a mother and her little boy.
She’s about to pour him a glass of milk out of the
pitcher.” At the end of the session, S was highly praised
for his performance on both the size estimation and
discrimination tasks. His failure on the punished picture
was discounted as “just one of those things,” and he
was told that it was a very tricky picture anyway and 
that many people did not do well on it. 


Experimental Design and Treatment of Data 


Each main subject group was divided into four 
subgroups, differing in the order in which the pictures 
were presented and in which pictures were used in the 
discrimination task. Table 1 shows this sequence and 
which pictures were judged after having been rewarded 
(R) and punished (P). This design was identical for 
each subject group. The N for each condition for the 
schizophrenics is 3 except for the Goods in Condition 1, 
where N = 4, and the Poors in Condition 4, where N =
2. For the Normals, N = 5 for each subgroup.

The measure of size estimation analyzed here is the 
distance of the slide from the standard position, hence 
the distance of the slide from the screen. This measure 
was used in order to approximate an equal discrimina- 
bility function. The relationship between this measure 
and the logarithm of the height of the image on the 
screen closely approaches linearity. The score for any 
given S is the mean of 8 of these values, except where 
otherwise indicated. A positive value refers to a judg- 
ment larger than the standard size, and a negative value 
to a smaller estimation. 
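
The scoring rule just described amounts to a mean signed deviation over the eight trials. The sketch below assumes that the slide position is recorded on an axis oriented so that positive deviations correspond to a larger image; the numbers, units, and function name are hypothetical and are not data from the study.

```python
from statistics import mean


def constant_error(slide_positions, standard_position):
    """Mean signed deviation of S's settings from the standard position."""
    return mean(p - standard_position for p in slide_positions)


# Hypothetical settings for one S on one picture (8 trials, arbitrary units).
STANDARD = 10.0
settings = [10.3, 9.9, 10.4, 10.1, 10.2, 10.0, 10.1, 10.0]

score = constant_error(settings, STANDARD)  # positive: judged larger than standard
```

With these made-up settings the score is positive, which under the stated convention would be read as an overestimation of the standard size.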


RESULTS 
Reward-Punishment Treatments 


Group means of the size estimation means 
of the rewarded and punished pictures for all 
conditions and for total are presented in 
Table 2. It is evident that, contrary to ex- 
pectation, all groups show a general under- 
estimation tendency except for the Goods on 
the punished picture. Group differences were 
evaluated statistically by analyzing the dif- 
ference between the means of the punished 
and rewarded pictures for each S. The break- 
down included first four vs. last four trials, 
neutral vs. mother-son pictures, and order of 


TABLE 1
EXPERIMENTAL DESIGN: IDENTICAL FOR EACH SUBJECT GROUP

Condition | Sequence of Pictures Judged
1         | Fe  | Rwd. (H-T) | Pun. (L-T) | Sc
2         | Sc  | Pun. (H-T) | Rwd. (L-T) | Fe
3         | H-T | Rwd. (Fe)  | Pun. (Sc)  | L-T
4         | L-T | Pun. (Fe)  | Rwd. (Sc)  | H-T



TABLE 2
SIZE ESTIMATION MEANS IN TERMS OF DEVIATIONS FROM STANDARD SIZE
FOR THE VARIOUS GROUPS AND CONDITIONS

                                 | Good Premorbid Schiz. | Poor Premorbid Schiz. | Normals
Condition                        | Rwd.   | Pun.   | Rwd.   | Pun.   | Rwd.   | Pun.
Rwd. on H-T, Rwd. Judged First   | −.192  |  .160  | −.102  | −.323  | −.154  | −.048
Rwd. on L-T, Rwd. Judged Second  | −.054  |  .108  | −.090  | −.027  | −.153  | −.186
Subtotal                         | −.133  |  .139  | −.096  | −.175  | −.154  | −.117
Rwd. on Fe, Rwd. Judged First    | −.521  | −.258  | −.273  | −.083  | −.294  | −.234
Rwd. on Sc, Rwd. Judged Second   |  .048  |  .104  | −.219  | −.460  | −.300  | −.254
Subtotal                         | −.236  | −.077  | −.251  | −.234  | −.297  | −.244
Total                            | −.181  |  .039  | −.166  | −.202  | −.226  | −.180


estimation of the two pictures as well as 
groups. An analysis of variance, a modification 
by the addition of a fourth dimension of 
Lindquist’s Type III design, was carried out 
on these difference scores after Bartlett’s test 
had revealed no significant heterogeneity of 
variance.⁴ The variance attributable to groups
was significant (F = 3.64, df = 2, 32, p < .05),
and t tests of differences between individual
means show that the Goods differed signifi- 
cantly from the Poors (p < .05), the Goods- 
Normals difference approached significance (p 
< .10), and the Poors and Normals did not 
reliably differ (p > .20). 
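
As a consistency check on Table 2, the subtotals and totals for the Goods can be recomputed from the condition means. The sketch assumes that the printed subtotals are N-weighted means of the condition means, which the report does not state, and uses the cell sizes given in the design section (N = 4 for Goods in Condition 1, otherwise 3). Entries whose sign or decimal point is not legible in the printed table are entered here with the sign that makes the subtotals consistent; that choice is an inference, not a reading.

```python
# Goods: (N, rewarded mean, punished mean) for Conditions 1-4, from Table 2.
GOODS = {
    1: (4, -0.192, 0.160),
    2: (3, -0.054, 0.108),
    3: (3, -0.521, -0.258),  # sign of the punished mean inferred from subtotal
    4: (3, 0.048, 0.104),
}


def weighted_mean(cells, column):
    """N-weighted mean of one column (1 = rewarded, 2 = punished)."""
    total_n = sum(cell[0] for cell in cells)
    return sum(cell[0] * cell[column] for cell in cells) / total_n


neutral = [GOODS[1], GOODS[2]]
mother_son = [GOODS[3], GOODS[4]]
everyone = neutral + mother_son

for label, cells in (("neutral", neutral),
                     ("mother-son", mother_son),
                     ("total", everyone)):
    rwd = weighted_mean(cells, 1)
    pun = weighted_mean(cells, 2)
    print(f"{label:10s} rwd {rwd:+.4f}  pun {pun:+.4f}")
```

Under these assumptions the recomputed values agree with the printed Goods subtotals and totals to within about .002, which is consistent with rounding of the condition means.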

The hypothesis of a difference between the 
reward and punishment treatments was tested 
for each of the groups by comparing the P-R 
differences to zero. It was found that the 
Goods significantly (p < .01) judged the 
punished picture larger than they did the re- 
warded one, but the differences between the 
two treatments for the Poors and Normals 
were not significant. It is clear that the sig- 
nificant difference between groups comes from 
the differential impact of the reward and 


⁴ To meet the requirement of proportionate cell
frequencies on this and subsequent analyses, the score 
closest to the mean was eliminated from the larger cell 
and the mean of the smaller cell added to that cell. This 
latter procedure necessitated the subtraction of two 
df from total and from error (w) in the analysis of 
variance. 
punishment conditions on the Goods as contrasted
with no appreciable effects in the
other two groups. The only other source of
variation in the table to reach significance
was the trials × order interaction (p < .05).
This reflects a tendency for all Ss to under- 
estimate rather greatly on the first trial of the 
first picture after the discrimination task. 
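
The comparison of the P-R differences to zero can be illustrated with an ordinary one-sample t statistic. This is a sketch only: the difference scores below are invented for illustration, and the report does not say whether the tests used each group's own variance, as here, or an error term from the analysis of variance.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(differences):
    """Return (t, df) for the hypothesis that the mean difference is zero."""
    n = len(differences)
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t, n - 1


# Hypothetical P-R difference scores for a group of 13 Ss (not the study's data).
p_minus_r = [0.31, 0.12, 0.25, -0.05, 0.40, 0.18, 0.22,
             0.09, 0.35, 0.15, 0.28, 0.02, 0.27]

t, df = one_sample_t(p_minus_r)
```

A positive t here would correspond to the punished picture being judged larger than the rewarded one; its significance would be read from a t table at the returned degrees of freedom.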

The differences between the R and P condi- 
tions were, in general, less on the last four 
trials than on the first four. On the first four 
trials, in fact, the Normals showed a signifi- 
cant (p < .05) overestimation of the punished 
picture relative to the rewarded one, as did 
the Goods. Although, in view of insignificant 
interactions involving trials in the larger 
analysis, this result can be looked on only as 
suggestive, it indicates that the reward-punish- 
ment variable may have had a transitory 
effect on the size estimations of the Normals, 
but that this effect dissipated in this group 
more rapidly than for the Goods. 


Feeding and Scolding Content 


Mean size estimation scores for the Feeding 
and Scolding pictures for the subgroups which 
were not given the reward-punishment treatments
on these pictures are shown in Table 3.
The means are strikingly similar to those for 
the treatments, showing a general underesti- 
mation tendency holding for all but the Goods 
on the Scolding picture. Means for the sub- 
groups which were “treated” on the mother- 
son pictures are also shown, as well as means 
for all four conditions. A 3 × 4 × 2 Type III
analysis of variance on the means for all four 
conditions shows that the groups variance 
approaches significance (F = 3.13, df = 2, 
32, p < .07). Examination of Table 2 shows 
that this lack of significance may well be due 
to the confounding effects of the punishment 
on the feeding picture on the last condition. 
An analysis of the means for only those condi- 
tions in which the mother-son pictures were 
not involved in the treatments would seem to 
provide a less contaminated test of the hypothesis.
Consequently, a 3 × 2 × 2 Type
III analysis of variance was carried out on 
these untreated means. The variance attrib- 
utable to differences between groups is sig- 
nificant (F = 3.69, df = 2, 16, p < .05), 
indicating that the content of the pictures has 
a differential effect on these three groups. 


TABLE 3
MEAN SIZE ESTIMATES OF THE FEEDING AND SCOLDING PICTURES
WHEN NOT SUBJECTED TO TREATMENTS

                          | Good Premorbid Schizophrenics | Poor Premorbid Schizophrenics | Normals
Order                     | Fe     | Sc     | Fe     | Sc     | Fe     | Sc
Fe-Sc                     | −.397  | −.008  | −.192  | −.323  | −.198  | −.121
Sc-Fe                     | −.021  |  .113  | −.139  | −.019  | −.095  | −.246
Mean                      | −.236  |  .044  | −.166  | −.171  | −.146  | −.184
Mean (Treated Subgroups)  | −.208  | −.105  | −.348  | −.137  | −.274  | −.267
Grand mean                | −.223  | −.025  | −.249  | −.156  | −.210  | −.226


Differences between the separate groups were
tested by t tests, which resulted in the findings
that the difference between the two schizophrenic
groups approaches significance (t =
2.09, p < .06), the Goods differ significantly
from the Normals (t = 2.61, p < .05), and the
Poors and Normals are not significantly different.
Here again it is the Goods who manifest
a reaction to the experimental variables.

This analysis was based on differences be- 
tween the picture judged second in the series 
and the one judged last (fifth). Because the 
intervening experience with the reward and 
punishment treatments may have influenced, 
by some kind of carry-over effect, the size 
judgments of the last picture, the judgments 
of the second picture only were studied. A 
Type III analysis of variance, including groups,
pictures, and trials, showed a significant groups
× pictures interaction (F = 3.73, df = 2, 16,
p < .05), which confirms the results of the
other analysis. Individual t tests show that the
Goods were the only group to judge the pictures
differently from one another (t = 4.48,
p < .005).

An examination of Conditions 3 and 4, 
where the treatments were given on the 
mother-son pictures, indicates something of the 
interaction between the two kinds of affective 
meaning. For the Goods on Condition 3, where 
the treatments and content are congruent, 
the difference between the size means for the 
two pictures is not markedly different from 
the conditions where the variables operated 
singly. This indicates the absence of any sum- 
mative process that might be expected. On 
the fourth condition where treatments and 
content are incongruous, the difference between
pictures is much the smallest of the
four conditions, indicating that the two
sources of negative affective meaning cancelled
each other out. The difference between
differences on these two conditions is not
significant (p < .11), however. The fact that
both pictures are rather extremely underestimated
on Condition 3 and slightly overestimated
on Condition 4 cannot be stressed since
these same Ss showed somewhat comparable
effects on the neutral pictures.

The data for the Poors reveal that for
both these conditions the Scolding picture is
underestimated much less than Feeding, a
difference significant at the .05 level by t test.
Previous experience with these pictures in the
discrimination task, regardless of its nature,
may have enhanced the symbolic value of
their content for the Poors.


Neutral Pictures 

The size judgments of the square and of the
untreated neutral pictures were analyzed to
test the hypothesis of differential over-all size
estimation tendencies. In neither case do the
differences between subject groups approach
significance, although the Poors show a smaller
underestimation trend on the untreated neutral
pictures.⁵ In addition, there were no marked
differences between the two neutral pictures
for any group, despite relatively marked
differences in structural properties. This indicates
that the difference between the two
mother-son pictures demonstrated above is
probably not due to such structural factors.


Other Aspects of the Adjustment Process 


Another important aspect of the method of 
average error is the consistency with which 
the judgments are made. This is given by 
the variability of a given set of individual 
judgments around their mean (variable error). 
This statistic was computed for each S on 
each picture. The over-all means of this 
measure for all Ss on all pictures are .459 for 
the Goods, .464 for the Poors, and .482 for the 
normals. It is interesting to note that these 
over-all differences are very small and that 
variability is slightly greater for the normals 

⁵ The means for the Goods, Poors, and Normals for
the untreated neutral pictures are −.228, −.078, and
−.267, respectively (compare with Tables 2 and 3).


than for the schizophrenics. No consistent
relationships between variability and either 
the size estimation mean or the affective value 
of the stimulus could be demonstrated. 
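
The variable error described above can be computed alongside the mean for each set of eight judgments. The sketch assumes that variable error is the standard deviation of the settings about their own mean (computed here with N in the denominator); the report does not give the exact formula, and the settings shown are hypothetical.

```python
from statistics import pstdev


def variable_error(settings):
    """Spread of S's settings about their own mean (population SD)."""
    return pstdev(settings)


# Hypothetical settings for one S on one picture (8 trials, arbitrary units).
settings = [10.3, 9.9, 10.4, 10.1, 10.2, 10.0, 10.1, 10.0]
shifted = [s + 0.5 for s in settings]  # same spread, different mean

ve = variable_error(settings)
ve_shifted = variable_error(shifted)  # equals ve up to floating-point error
```

Because the measure is taken about each set's own mean, it is unaffected by a uniform over- or underestimation, which is why it can be examined separately from the size estimation mean.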

Two kinds of data were quantified from the 
graphic records of the Ss’ adjustments: the 
time taken in making the adjustments, and 
changes in the direction of adjustment. For 
the time of adjustment, the over-all averages 
show that the Goods took an average of 10.07 
seconds per trial, the Normals 7.50 seconds, 
and the Poors 7.68 seconds.⁶ A simple analysis
of variance shows the difference significant 
(F = 3.93, p = .05, df = 2/39) and that the 
Goods differ from both the other groups. 

A change in the direction of adjustment 
refers to the cases where S overshot the mark 
and moved the image back in the opposite 
direction from the way in which he had been 
moving it. It was thought that this, as well 
as the time variable, might reflect a hesitancy 
in coming to a decision. The mean number of 
trials per picture on which one or more of 
these direction changes occurred is 3.01 for 
the Goods, 2.28 for the Poors, and 1.24 for 
the Normals. The three groups were signifi- 
cantly different (F = 6.68, p = .005, df = 
2/40). The Goods differed from the Normals
(p < .001), but the Poors were only sug- 
gestively different from the Normals (p < .10) 
and not different from the Goods. For both 
these variables, the differences between pic- 
tures for the three groups were quite small and 
nonsignificant. 

DISCUSSION 

The results on size estimation, although 
they are contradictory of the specific research 
hypotheses that were advanced, confirm the 
more general hypothesis and research finding 
that the two schizophrenic groups differ in 
reactions to stressful or threatening situations 
(3, 13, 17). 

In addition, the experiment demonstrates 
that having been consistently censured in re- 
sponding to a stimulus is a variable influencing 
the estimation, from immediate memory, of 
the size of the stimulus, at least for one class 
of S. It also suggests a functional equivalence 


⁶ The mean for the Poors omits one extreme S whose
time of adjustment was over twice as long as that of the 
next most extreme case. This was done on the basis of a 
test for extreme scores (7, p. 243). 
between the experimental application of censure
and the symbolic representation of a
censorious mother-son relationship. 

In view of the failure to find consistent 
differences in over-all size estimation tend- 
encies for the three groups, Klein’s control 
hypothesis seems inapplicable to the results of 
the present study. According to the accentua- 
tion hypothesis, however, the interpretation 
appears at least tentatively justifiable that a 
reliable relative overestimation in the present 
experiment indicates the picture so judged 
had more personal meaning for the Ss than had 
pictures not so judged. The mechanisms by 
which an attitude is translated into a change in 
estimated size remain obscure. 

A question of more primary concern here is 
what mechanisms underlie increased personal 
relevance. The discrimination task was de- 
signed to present the Punished picture in close 
conjunction with a negative reinforcement. 
Thus, the picture would be expected to acquire 
secondary reinforcing properties and to evoke 
the same kind of reaction as is produced by 
the punishment itself. This reaction is prob- 
ably emotional in nature. The magnitude of 
this emotional reaction produced by the pun- 
ished picture is probably greater in persons 
characterized as having a high degree of 
anxiety or affect. These considerations lead 
to the hypothesis that the Goods used in the 
present experiment are characterized by a 
higher degree of anxiety or affective sensitivity 
than are the Poors. 

The same kind of reasoning may be applied 
to the mother-son pictures. Here the emotional 
reactions to situations of the kind pictured by 
the Scolding picture are assumed to have al- 
ready been conditioned by virtue of the harsh 
treatment schizophrenics are thought to have 
received from their parents. The picture, then, 
presumably serves as a cue to evoke, at least 
partially, the same kind of reaction as was 
engendered by the situation it represents. The 
strength of this reaction is probably a function 
both of the experience of S with his mother 
and of the amount of anxiety or affect charac- 
teristic of him. It is interesting to note that the 
Goods gave evidence of as great a change in 
size estimation to the Scolding picture as to 
the Punished picture. Although the Poors were 
relatively unresponsive to the treatment variable
(in terms of size estimation changes),
they showed evidence of being sensitive to the
mother-son pictures under some conditions. 

The relatively long time of adjustment and 
large number of direction changes found for 
the Goods may indicate a greater motivation 
to succeed in a task. This motivation may, in 
turn, be functionally related to anxiety. 

The results for the patients taking tranquil- 
izing drugs seemed to provide some evidence 
for the anxiety hypothesis. It was found that 
the differences in the Goods from the other 
groups were not present in the patients taking 
drugs (18). Previous clinical and experimental 
studies on the effects of chlorpromazine on 
psychiatric patients have noted a lessening of 
clinically manifested and subjectively felt 
anxiety (1, 11) and pathological affect (19) 
after a period of administration of the drug. It 
seems reasonable, then, to assume that the 
dependent variables in the present experiment 
on which differences occurred between no-drug 
and chlorpromazine Ss may be influenced by 
motivational states such as anxiety. In other 
words, the fact that the effects of the punish- 
ment and Scolding content were absent for 
those patients taking chlorpromazine indicates 
that the effects observed in the no-drug groups 
were a function of a relatively high state of 
anxiety or affective responsivity. However, 
this argument is weakened by lack of control 
over the sampling of the patients taking drugs. 

The results presented here conflict with pre- 
vious studies which show greater reactions to 
stressful and threatening situations in poor 
premorbid schizophrenics than in Goods or 
Normals, and specifically with the results of 
Harris’ study (13) and the preliminary experi- 
ment which showed a relative general overesti- 
mation tendency by the Poors. The experi- 
mental situation of the present study, although 
somewhat different from that of Harris, was 
quite similar to that of the preliminary study. 
This makes it somewhat unlikely that differ- 
ences in technique are responsible for the con- 
flicting results of the present study, although 
this possibility cannot be entirely ruled out. 
It would seem more probable that the differ- 
ence is a function of the sampling of the Ss. 
One striking difference stands out in this 
respect—the mean length of time from the 
first psychiatric hospitalization of the patients. 
In the present experiment, this was 6.00 years 
for the Goods and 8.09 years for the Poors, as 
compared with about three years for Harris’ 
total group and less than that for the Ss used 
in the preliminary experiment. Thus, it can be 
inferred that the patients in the present study 
were drawn from a different population as far 
as this chronicity variable goes. 

There seem to be two ways in which this 
chronicity variable could work to produce a 
different sampling of patients. First, there 
may be some effect of chronicity per se on be- 
havior in the size estimation task. Second, 
sampling from a chronic population would lead 
to choosing patients with poor prognoses and 
missing patients who have improved and left the 
hospital. Also, more formerly available pa- 
tients would have deteriorated to the point of 
unsuitability as experimental Ss. 

It is clear that the Goods in the present 
study were atypical from the standpoint of 
prognosis since this kind of patient has been 
found to improve more readily than Poors 
(10, 16). It may be speculated that their sen- 
sitivity to stress manifested here has had some 
relation to their demonstrated lack of im- 
provement. The sensitivity to the Scolding 
picture may indicate that these patients had a 
more typical Poor premorbid familial pattern. 

For the Poors, on the other hand, the gradual 
progression of the disease may be responsible 
for the lack of effect of the motivational 
variables on size estimation. A concomitant of 
long hospitalization in chronic schizophrenics 
is a decreased sensitivity to the external and 
internal environment, usually described as 
withdrawal, decreased motivation, or increased 
apathy (2, 14). Such a general decrease in 
affective responsivity may be responsible for 
the negative findings for the Poors. The two 
years’ difference in hospitalization time would 
seem not to be enough, however, to account 
for the differences between the two premorbid 
groups. It is possible that the difference in 
premorbidity may manifest itself (in patients 
who do not leave the hospital) in differential 
rates of progression of the illness. 

However, these speculations on the effects 
of hospitalization length are not confirmed by 
the present data. No consistent differences in 
performance can be found in either group by 
separation into subgroups on the basis of 
length of hospitalization. This cannot be regarded 
as a crucial test, however, since even 
the patients in the short hospitalization groups 


in this breakdown were psychotic for a longer 
time than Ss used in previous studies. 

The possibility cannot be ignored that the 
Poors, in view of their chronicity, were com- 
pletely “out of” the tasks and behaved in a 
more or less random manner. However, this 
seems unlikely since they were comparable to 
the other groups in terms of the intra-individ- 
ual variability of the size judgments and in the 
magnitudes of the constant errors on most of 
the pictures. Also, in the discrimination task 
on the punished picture, they gave significantly 
more reversals of response after a “wrong.”7 
These reversals have been interpreted by 
Rodnick and Garmezy (17) as indicating at- 
tempts by S to avoid punishment. This sug- 
gests that the punishment had some effect on 
the behavior of the Poors.8 The failure of this 
effect to manifest itself in the size estimations 
of the Poors may itself be a consequence of 
strong avoidance tendencies in these Ss. In 
view of their long history of avoidance of and 
withdrawal from interpersonal situations, it 
may be speculated that they were able to avoid 
the affective properties of the stimuli; that is, 
they were able to avoid getting involved in the 
size estimation task. 


SUMMARY 


The present experiment investigated the 
size estimations from immediate memory of 
pictures imbued with two kinds of affective 
meaning by schizophrenics with good pre- 
morbid adjustments, schizophrenics with poor 
premorbid adjustments, and normals. Positive 
and negative affective meanings were built 
into two pictures by means of a prior task in- 
volving differential reinforcement (right and 
wrong) of S’s responses to them. In addition, 
size estimates were made of pictures depicting 
a mother scolding a boy and feeding a boy. The 
good premorbid schizophrenics, in contrast to 
the other two groups, significantly overesti- 
mated the sizes of both the picture associated 
with “wrong” and the Scolding picture, rela- 
tive to the rewarded and Feeding pictures. 
The results are interpreted in terms of a high 
degree of anxiety or affective responsivity in 

7 A reversal is said to occur if S responds “same” on 
one trial and “different” on the next, or vice versa. 

8 There were no significant differences between the 
three groups in number of correct responses to the 
rewarded picture. 
the Goods and the predominance of avoidance 
and withdrawal mechanisms in the Poors. They 
also demonstrate the possible fruitfulness of 
the size estimation technique in discovering 
affective variables influencing the behavior of 
schizophrenic patients. 


MENTAL ILLNESS, MILIEU THERAPY, AND SOCIAL ORGANIZATION 
IN WARD GROUPS 


EDWARD J. MURRAY AND MELVIN COHEN1 
Walter Reed Army Institute of Research 


CULTURALLY oriented personality theorists 
such as Sullivan have suggested 
that mental illness, broadly conceived, 
is related to difficulties in interpersonal relations 
(6). Fromm-Reichmann (2) views mental 
illness as a condition involving a withdrawal 
from social relationships because of fears of re- 
jection and feelings of inadequacy which can 
be alleviated by a therapeutic process involv- 
ing insight, emotional discharge, and respect 
by the therapist. This general theory of mental 
illness would lead one to predict that the 
greater the degree of mental illness of a patient 
the more his social relationships with his fellow 
patients in a mental hospital will show various 
disturbances such as withdrawal. McMillan 
and Silverberg (3) have presented evidence 
which generally supports this prediction. 

A continuum of degree of mental illness was 
obtained by McMillan and Silverberg by rank- 
ing five wards according to decreasing level of 
adjustment: (a) neurological, (b) gastrointestinal 
medical, (c) open psychiatric, with 
neurotics and psychotics, (d) insulin therapy, 
with anxiety neurotics, and (e) closed psychi- 
atric, with psychotics. Sociometric choice pat- 
terns were then compared with three results. 
First, there was a trend suggesting that the 
greater the degree of mental illness, the fewer 
reciprocal choices, which was, however, sig- 
nificant only for the insulin ward. Second, there 
was a slight but not significant trend suggest- 
ing that the more disturbed patients concen- 
trated their negative choices on fewer people. 
Third, there was a striking tendency for pa- 
tients on all wards to have more positive than 


1 The authors wish to thank Stanley Pliskoff, who 
helped with some of the statistical analysis, and Harold 
L. Williams and Ardie Lubin for reading the manuscript 
and making valuable suggestions. The authors 
also wish to thank the numerous staff members and 
patients of Walter Reed Army Hospital for their cooperation 
in this study. A portion of this study was 
presented at the American Psychological Association 
meeting in New York in August, 1957. The senior author 
is now at Syracuse University and M. Cohen is 
at New York University. 


negative reciprocations. While these results 
generally tend to support the hypothesis, they 
are not all as clear and consistent as one would 
expect. 

The purpose of the present study was to 
apply the McMillan and Silverberg technique 
to a somewhat more easily ranked group of 
wards and to employ somewhat simpler meas- 
ures in the hope of obtaining a more clear-cut 
test of the hypothesis that the greater the 
degree of mental illness, the greater the dis- 
turbance of social relationships. An additional 
purpose was to take an initial step in deter- 
mining the effect of milieu therapy on group 
organization in psychiatric wards. 


METHOD 


Major Ward Groups 


A total of 132 patients in three wards at a large 
army hospital was used in the main part of the study. 
Nearly all of the patients were enlisted men. The 
control medical ward was a large orthopedic ward with 
a total of 40 patients, 5 of whom were omitted from the 
study because of absence or language difficulty. The 
majority of these patients had limb fractures or amputations 
but were ambulatory and able to socialize. The 
ward could be conveniently divided into front, back, 
and porch areas. The open psychiatric ward contained 
51 patients, of whom 2 were omitted and 15 refused to 
participate. The diagnosis of the patients included 
various neuroses, nonacute psychoses, and character 
disorders. The men in this ward were considered by 
the staff to be almost well-adjusted enough to be returned 
to duty or discharged to their own care. If they 
became disturbed they were returned to the locked 
ward. The men had ground privileges and frequent 
weekend passes. There were common eating and recreational 
facilities, but the men slept in three separate 
areas. The locked psychiatric ward contained 41 patients, 
13 of whom were omitted because of language difficulties 
or because they were not in contact enough to fill in 
their own names and identifying data on the question- 
naire. This ward contained more acute psychotics and 
severe neurotics as well as patients who had not been 
fully diagnosed. The staff felt that these patients were 
considerably more disturbed than the patients in the 
open psychiatric ward. The patients were locked in the 
ward and escorted to all activities and examinations. 
They were not allowed passes. The ward had three 
sleeping areas as well as a central day room. 


Additional Ward Groups 


In addition to the three wards used in the main 
part of the study, three smaller wards were also used 
to study the effects of milieu therapy on group organiza- 
tion. The nonpsychiatric control ward was an experi- 
mental ward used for long-term studies of metabolic 
diseases. At the time of the initial sociometric question- 
naire administration, there were eight patients on the 
ward. These patients were ambulatory but were re- 
stricted to the ward in order to control diet rigidly and 
to measure bodily waste materials. The milieu therapy 
ward was an experimental psychiatric ward for the 
long-term treatment and study of schizophrenia. The 
patients all had been hospitalized during basic train- 
ing. On the ward, the patients were put in a thera- 
peutically oriented milieu (4), including group therapy 
three times a week, varying hours of individual therapy 
with the psychiatrist, and frequent therapeutically 
oriented contacts with the nurse, social workers, and 
psychiatric aides all coordinated by weekly staff con- 
ferences. The staff was especially selected for empathic 
attitudes and high motivation. None of the patients 
received drugs, shock, or other somatic therapy. There 
were nine patients on the ward at the time of the first 
administration of the sociometric questionnaire. The 
somatic therapy ward was a typical closed psychiatric 
ward for disturbed patients. The patients were re- 
ferred from smaller army hospitals and showed a 
greater age, rank, and diagnostic range than the ward 
just described. However, most of the patients were 
schizophrenic and all were confined to the locked ward. 
The patients were being treated with a long series of 
electroconvulsive shocks, a series of insulin comas, or 
massive daily doses of tranquilizing drugs. Very little 
psychotherapy was employed. The staff was competent 
and pleasant but not oriented towards milieu therapy. 
There were 15 patients on the ward at the time of the 
first sociometric questionnaire. 

From four to eight months after the first administration 
of the sociometric questionnaire, the questionnaire 
was given again to the same three wards. By this 
time, all of the patients on the milieu therapy ward 
and the somatic therapy ward and all but three of the 
patients on the control ward had been replaced by new 
patients. At this time there were seven patients on the 
control ward, six on the milieu therapy ward, and 
eight on the somatic therapy ward. A few patients on 
all wards were omitted both times because they were 
on the ward for less than a week. The average length 
of stay was about equal on all wards. 


Sociometric Questionnaire 


The questionnaire was mimeographed in a booklet 
clearly marked confidential. The patients were asked 
to give their name, ward number, and other identifying 
data and were omitted if they could not reasonably 
fill out this section. Then, the patients were asked 
simply to list all of the names of the patients on the 
ward that they could remember. First names and 
nicknames were accepted as well as last names. Finally, 
the patients were asked to make choices on twelve 
sociometric items. They were asked to select at least 


one person for each item but allowed to write in as 
many more as they wished. There were four spaces 
for names under each item but the patients were allowed 
to write in more if they wished to. The sociometric 
questionnaire included: four positive items—like most, 
at ease with most, like to eat with most, like to work 
with most; four negative items—like least, ill-at-ease 
with, like to eat with least, like to work with least; 
and four neutral items—tallest patients, heaviest 
patients, most intelligent patients, and best-liked 
patients. Later evidence suggested that the neutral 
items, particularly the last two, were influenced by the 
patients’ likes and dislikes. The 12 sociometric items 
were mimeographed in a random order. 


Procedure 


Arrangements were made with the staff ahead of 
time to insure maximum attendance at a group ad- 
ministration of the questionnaire on each ward. The 
senior author explained the general nature of the 
questionnaire and emphasized its confidential nature. 
The booklet and pencils were handed out, and the 
patients worked at their beds or in the day room. Com- 
munications between the patients were kept to a 
minimum. All patients who were absent at the group 
meeting or who had refused to fill out the questionnaire 
were contacted individually and asked again to co- 
operate. 


Social Background Factors 


After the administration of the sociometric question- 
naire, the hospital records of each patient were ex- 
amined and information obtained about age, race, 
rank, length of service, marital status, education, 
population area of origin, branch of service, and 
religion. Information was also obtained about the 
length of time in the hospital and on the ward, diag- 
nosis, and ward bed assignment. 


RESULTS AND DISCUSSION 


Degree of Mental Illness and Sociometric Choice 


A number of sociometric measures were 
directly related to the degree of mental illness 
represented by the three major wards. This 
can be seen graphically in Fig. 1 where the 
sociograms based on positive reciprocal 
choices are shown for the control medical, open 
psychiatric, and locked psychiatric wards. Re- 
ciprocals are based on a mutual positive choice 
regardless of which of the four items was in- 
volved. These three wards form a continuum 
of degree of mental illness about which there 
can be very little disagreement. The socio- 
grams indicate that as degree of mental illness 
increases there is a decrease in the complexity 
of the social organization as well as a decrease 
in the number of reciprocals. As Table 1 shows, 

FIG. 1. POSITIVE RECIPROCAL CHOICES FOR 
THREE WARD GROUPS REPRESENTING A 
CONTINUUM OF DEGREE OF MENTAL 
ILLNESS 


the percentage of patients with one or more 
reciprocals shows a significant variation over 
all wards (p < .01). (Unless otherwise stated, 
significance levels in this study are based on a 
one-tail chi-square test.) There are fairly sig- 
nificant decreases from the control medical to 
the open psychiatric (p < .07) and from the 
open psychiatric to the locked psychiatric 
ward (p < .05). The percentage of patients 
who had one or more negative reciprocals was 
significantly lower than for positive recipro- 
cals for all wards (p < .01). However, the 
same trend as with positive reciprocals was 
evident with fewer people with negative re- 
ciprocals as degree of mental illness increased 
(p < .02). This suggests a general withdrawal 
rather than a differential effect of mental ill- 
ness on positive and negative aspects of inter- 
personal relationships. 

It is of interest to examine the number of 
reciprocal choices as well as the number of pa- 
tients involved, since the number of reciprocal 
choices may reflect the complexity of the social 
organization more faithfully. For example, 
there may be a reciprocal choice between A 
and B and between B and C involving three 
patients altogether. However, this is a simpler 
organization than if there were also a reciprocal 
choice between A and C, which would increase 
the number of reciprocals but not the number 
of patients with one or more reciprocals. But 
in making such a comparison, it must be kept 
in mind that generalizations are limited to a 
population of not fully independent choices. 
It is felt that in this case such a comparison is 
useful and legitimate since the finding has 
already been established for a population of 
patients. The percentages of both positive and 
negative choices which were reciprocated in 
the control medical, open psychiatric, and 


TABLE 1 

VARIOUS SOCIOMETRIC MEASURES IN WARDS 
REPRESENTING A CONTINUUM OF DEGREE 
OF MENTAL ILLNESS 

                                          Degree of Mental Illness 
                                      Control     Open        Locked 
Sociometric Measure                   Medical   Psychiatric Psychiatric 
                                       Ward       Ward        Ward 

Percentage of patients with one 
  or more positive reciprocal           74         50          25 
Percentage of patients with one 
  or more negative reciprocal           26          6 
Percentage of positive choices 
  reciprocated                          44 
Percentage of negative choices 
  reciprocated 
Average number of names known 
Average number of positive choices 
Average number of negative choices 
Average number of neutral choices 
Percentage of patients receiving 
  no choices at all (total isolates) 


locked psychiatric wards are shown in Table 1. 
As degree of mental illness increases there are 
fewer reciprocal choices on either positive or 
negative items. The over-all decrease of the 
percentages of positive reciprocals was sig- 
nificant (p < .001). The difference between 
the locked psychiatric and the open psychi- 
atric wards was significant (p < .01) as was 
the difference between the locked psychiatric 
and the control medical wards (p < .001), but 
the difference between the open psychiatric 
and the control medical wards failed to reach 
significance on this measure. The percentages 
of the negative reciprocals showed an over-all 
decrease with degree of maladjustment 
(p < .01). The over-all difference between per- 
centage of positive and negative reciprocals is 
significant (p < .001) but appears to be about 
equal for all wards. 
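
The distinction drawn above between the number of reciprocal choices and the number of patients holding at least one reciprocal can be sketched in a few lines of Python. The data layout (a mapping from each patient to the set of fellow patients he chose) and the function names are illustrative assumptions, not the authors' scoring procedure.

```python
from itertools import combinations


def reciprocal_pairs(choices):
    """Return the set of unordered pairs of patients who chose each other.

    `choices` maps each patient to the set of patients he named on any
    positive item (a hypothetical layout chosen for this sketch).
    """
    pairs = set()
    for a, b in combinations(sorted(choices), 2):
        if b in choices[a] and a in choices[b]:
            pairs.add((a, b))
    return pairs


def patients_with_reciprocal(pairs):
    """Return the set of patients involved in at least one reciprocal pair."""
    return {patient for pair in pairs for patient in pair}


# A-B and B-C reciprocate; A and C do not.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
# The same three patients with the A-C reciprocal added.
triangle = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}

chain_pairs = reciprocal_pairs(chain)
triangle_pairs = reciprocal_pairs(triangle)
print(len(chain_pairs), len(patients_with_reciprocal(chain_pairs)))        # 2 3
print(len(triangle_pairs), len(patients_with_reciprocal(triangle_pairs)))  # 3 3
```

In the first arrangement two reciprocals involve three patients; adding the A and C reciprocal raises the number of reciprocals to three while the number of patients involved stays at three.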

Another way of showing this relationship is 
to simply compare the average number of names 
known on each ward. Table 1 makes this com- 
parison; the more disturbed the ward, the 
fewer the names known. The over-all differ- 
ence is significant (p < .01) as well as the 
difference between the control medical ward 
and the open psychiatric ward (p < .05) and 
the difference between the control medical 
ward and the locked psychiatric ward 
(p < .01). The difference between the open 
and locked wards did not reach significance. 

Similarly, the average absolute number of 
choices on the sociometric items decreases 
with degree of mental illness. Table 1 shows 
over-all decreases in average positive, negative, 
and neutral choices which are all significant 
(p < .05). The over-all average number of 
negative choices is significantly less (p < .01) 
than the average number of positive or neutral 
choices. 

The number of social isolates increases with 
degree of maladjustment (Table 1). There is a 
significant difference in percentage of patients 
receiving no choices between the control medi- 
cal ward and the two psychiatric wards com- 
bined (p < .01). The difference is accounted 
for almost entirely by a difference in the num- 
ber of isolates on positive sociometric items, 
since from 40 to 60% of patients on all wards 
had no negative or neutral choices, a value 
that is significantly higher (p < .001) than the 
20% who had no positive choices. 

An analysis of variance showed that the 
three groups did not differ significantly in 
average length of time on the ward or in the 
hospital. Furthermore, the various measures 
showed almost zero correlations with number 
of weeks on the ward or in the hospital. This 
result would indicate that the decrease in inter- 
personal relationships from control medical to 
open psychiatric to locked psychiatric wards 
was not due to different opportunities for 
socialization. 

It is also of interest to examine the degree to 
which the same person was chosen on several 
positive items or negative items. A score 
analogous to the criteria overlap score used by 
McMillan and Silverberg was obtained for 
each patient. There was no significant differ- 
ence between the three ward groups on either 
positive or negative items. However, there was 
an over-all tendency for greater overlap on 
positive than negative items (p < .01). These 
results are in agreement with those of 
McMillan and Silverberg. 

An acceptability score was also adapted 
from McMillan and Silverberg, consisting of 
the percentages of patients from all wards who 
were chosen various numbers of times on posi- 
tive, negative, or neutral items. As would be 
expected, most patients were chosen very few 


TABLE 2 

PERCENTAGE OF SOCIOMETRIC CHOICES FOR 
INDIVIDUALS SIMILAR TO SELF IN SOCIAL 
BACKGROUND VARIABLES 

                                          Degree of Mental Illness 
                                      Control     Open        Locked 
Social Background Variable            Medical   Psychiatric Psychiatric 
                                       Ward       Ward        Ward 

Age (within ± 5 years)                             64          46 
Race (Caucasian vs. Negro)                         92          72 
Population area (metropolitan 
  vs. other)                                       71          47 
Rank (Sgt. vs. Cpl. or Pvt.)                       61          50 
Length of service (under or over 
  3 years)                                         55          50 
Marital status (single vs. married)                56          44 
Education (under or over ninth 
  grade)                                           50          44 
Geographical area (north vs. south)     55         68          56 
Service (Army vs. Air Force)            68         52          64 
Religion (Prot. vs. Catholic)           56         59          59 




times if at all on any of the items, while a few 
people were chosen very often. But it was also 
found that fewer people received no choices at 
all on positive items than negative or neutral 
items. This result is contrary to McMillan 
and Silverberg’s finding of fewer patients with 
zero neutral choices. Other differences between 
positive, negative, and neutral choices were 
not significant. McMillan and Silverberg’s 
finding that more people had two positive 
choices than two negative or two neutral 
choices was not confirmed. Except for the dif- 
ferences already noted, there were no signifi- 
cant differences between wards. 


Sociometric Choice and Similarity of Social Background 


Another way of getting at the relationship 
between mental illness and interpersonal rela- 
tions is to examine the extent to which pa- 
tients on different wards choose fellow patients 
with social backgrounds similar to themselves. 
A number of social background variables that 
might be expected to influence sociometric 
choices are listed in Table 2. Each variable is 
dichotomized at some convenient point so that 
a 2 × 2 table could be set up to measure the 
degree to which patients made choices of fellow 
patients in the same category. In the control 
medical ward, for example, 80 of the 86 choices 
made by Caucasians were for Caucasians, and 
8 of the 11 choices made by Negroes were for 
Negroes. Therefore, 88 out of 97 or 91% of 
the choices were for persons similar to self on 
the racial variable. The distribution of patients 
with these characteristics was about the same 
in the three wards. The general pattern of re- 
sults in Table 2 suggests that as mental illness 
increases, similarity in social background 
variables plays less of a role in determining 
sociometric choices. On the control medical 
ward, four social background variables yielded 
significant percentages—age (p < .03), race 
(p < .01), rank (p < .02), and length of serv- 
ice (p < .04). On the open psychiatric ward, 
only three background variables yielded significant 
percentages—age (p < .01), race 
(p < .01), and population area of origin 
(p < .01). On the locked psychiatric ward, no 
social background variables were significant. 
The locked ward percentage is significantly 
lower (p < .07 or better) than the other two 
wards for age, race, and population area of 
origin. Rank, length of service, marital status, 
and education show similar trends, but these 
are not significant. The results suggest several 
additional aspects of social behavior that might 
be influenced by mental illness. Caudill et al. 
(1) also report that on an intensively studied 
ward there was a “... muting of outer-world 
distinctions on the basis of race, ethnic group, 
or social class....” 


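
The percentage for the racial variable on the control medical ward can be reproduced directly from the counts given above. The function and variable names below are illustrative only.

```python
def percent_similar(same_counts, total_counts):
    """Percentage of all choices that went to someone in the chooser's own category."""
    same = sum(same_counts.values())
    total = sum(total_counts.values())
    return 100.0 * same / total


# Counts reported in the text for the control medical ward (race variable):
# choices for a patient of the chooser's own category, and all choices made.
same_counts = {"Caucasian": 80, "Negro": 8}
total_counts = {"Caucasian": 86, "Negro": 11}

pct = percent_similar(same_counts, total_counts)
print(round(pct))  # 88 of 97 choices, i.e. 91
```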
Sociometric Choice and Similarity of Diagnosis 


An effort was also made to evaluate socio- 
metric choices in terms of similarity in diag- 
nosis. When the data for the control medical 
ward were analyzed by amputation, fracture, 
and other groups it was found that 20% of the 
sociometric choices were within groups. The 


TABLE 3 

PROPINQUITY AS A FACTOR IN SOCIOMETRIC CHOICES 
AND DEGREE OF MENTAL ILLNESS 

                                     Degree of Mental Illness 
                                 Control     Open        Locked 
Sociometric Measure              Medical   Psychiatric Psychiatric   All 
                                  Ward       Ward        Ward       Wards 

Percentage of positive choices 
  within own section of ward       44         45          54         46 
Percentage of negative choices 
  within own section of ward       42         32          38         39 


psychiatric wards were subdivided into para- 
noid schizophrenic, other schizophrenic, char- 
acter disorder, neurotic, and other groups. On 
the open psychiatric ward, 40% of the choices 
were within groups, a value somewhat higher 
than the 21% on the locked psychiatric ward. 
When the psychiatric wards were categorized 
as schizophrenic and nonschizophrenic, 62% 
of the choices on the open ward and 46% on 
the locked ward were within groups. These 
differences are not significant, and these diag- 
nostic factors appear to be less important than 
some of the social background variables. On 
the other hand, Shipman’s finding (5) that 
paranoids tend to choose one another while 
schizoids do not was supported. When the 
choices of schizophrenics from both psychiatric 
wards were examined it was found that 
64% of the choices by paranoid schizophrenics 
were for other paranoid schizophrenics as op- 
posed to all other patients. Only 33% of the 
nonparanoid schizophrenic choices were for 
nonparanoid schizophrenics. This difference is 
significant (p < .05). 


Sociometric Choice and Propinquity 


A somewhat different kind of variable that 
might be related to mental illness is propin- 
quity, defined as the degree to which patients 
made sociometric choices from among those 
patients who slept in the same area as they. 
It will be recalled that each ward was divided 
into three sleeping areas. Table 3 shows the 
percentage of choices within the patient’s own 
sleeping area on positive and negative items 
for the three ward groups. The percentage that 
would be expected if the choices were randomly 


FIG. 2. POSITIVE RECIPROCAL CHOICES FOR 
A CONTROL MEDICAL WARD, A LOCKED 
PSYCHIATRIC WARD EMPHASIZING 
MILIEU AND GROUP THERAPY, AND 
A LOCKED PSYCHIATRIC WARD 
EMPHASIZING SOMATIC 
TREATMENT 


distributed throughout the ward is 
33%. There are no significant differences be- 
tween the wards on either positive or negative 
items. However, over all wards more positive 
choices are made within the patient’s sleeping 
area than would be expected by chance 
(p < .001). On the other hand, the percent- 
ages of negative choices are what would be 
expected by chance. The difference for all 
wards between the percentage of positive and 
negative choices within the area approaches 
significance (p < .06). There is an insignificant 
trend for this difference to increase with degree 
of mental illness. 
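
A comparison against the 33% chance level of the kind described above could be made with a one-degree-of-freedom chi-square test of goodness of fit. The sketch below is only an illustration: the total of 100 choices is hypothetical (the text reports percentages, not raw counts), and the original analysis may have used a different procedure.

```python
def chi_square_vs_chance(observed_within, total_choices, chance=1 / 3):
    """Chi-square (1 df) comparing within-area choices with a chance proportion."""
    expected_within = total_choices * chance
    expected_outside = total_choices - expected_within
    observed_outside = total_choices - observed_within
    return (
        (observed_within - expected_within) ** 2 / expected_within
        + (observed_outside - expected_outside) ** 2 / expected_outside
    )


# Hypothetical counts: 46 of 100 positive choices fall in the chooser's own
# sleeping area (46% is the all-wards figure in Table 3; the 100 is assumed).
stat = chi_square_vs_chance(46, 100)
print(round(stat, 2))
```

A statistic above 3.84 exceeds the conventional .05 critical value for one degree of freedom; how far the actual data exceed it depends on the real number of choices, which is not given here.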


Milieu Therapy and Ward Organization 


An initial indication of the effect of milieu 
therapy on social relationships can be seen in 
Fig. 2 where the sociograms for the three 
smaller wards are compared. The social organi- 
zation on the nonpsychiatric control ward, as 
indicated by reciprocal choices on the “like-best” 
item, is fairly complex and involves 75% 
of the group. On the other hand, the somatic 
therapy ward shows little social organization 
with reciprocal choices on the “like-best” item 
limited to 27% of the group. The critical ward 
is the milieu therapy one shown in the center 
of Fig. 2. It can be seen that the milieu therapy 
ward has a complex organization involving 
78% of the group and therefore resembles the 
nonpsychiatric control ward much more than 
the typical closed psychiatric ward. The differ- 
ence between these percentages in the control 
and milieu therapy wards is not statistically 


FIG. 3. POSITIVE RECIPROCAL CHOICES FOR A 
CONTROL MEDICAL WARD, A LOCKED 
PSYCHIATRIC WARD EMPHASIZING MILIEU 
AND GROUP THERAPY, AND A LOCKED 
PSYCHIATRIC WARD EMPHASIZING 
SOMATIC TREATMENT (BASED ON A 
SECOND SOCIOMETRIC QUESTIONNAIRE 
AFTER A PATIENT 
TURNOVER) 


significant, while the difference between the 
milieu therapy and somatic therapy wards is 
(p < .05). The number of patients on these 
wards is too small to justify an elaborate sta- 
tistical analysis. More reliance is placed on the 
fact that the same over-all, qualitative result 
was obtained when the study was replicated 
with the same wards after a patient turnover. 
This is shown in Fig. 3: the nonpsychiatric 
control and the milieu therapy wards show a 
complex social organization, while there is 
much less interaction and social patterning on 
the somatic therapy ward. It is conceivable 
that the lack of social organization on the 
somatic therapy ward is due to a direct effect 
of the treatment. But this is not likely. An 
examination of the treatment history of each 
individual patient on the ward failed to reveal 
a relationship between type or length of 
somatic therapy and sociometric results. For 
example, the three patients with reciprocal 
choices in Fig. 3 included a patient who had 
just completed 20 electroconvulsive shocks, a 
patient getting heavy doses of a tranquilizing 
drug, and a patient who had had 50 insulin 
coma treatments. These patients can be com- 
pared with the three most isolated patients of 
whom one had just completed 24 electrocon- 
vulsive shocks, one was getting heavy doses of 
a tranquilizing drug, and another had just had 
six electroconvulsive shocks. While the results 
suggest that milieu therapy improves group 
organization on the ward, other evidence is 
needed to evaluate the long-term effects of the 
milieu therapy on the patient’s illness and 
adaptive capacities. 


SUMMARY AND CONCLUSIONS 


The study was designed to test the hypothe- 
sis that the greater the degree of mental illness 
in a patient, the more disturbed are his social 
relationships. A sociometric questionnaire was 
administered to three wards arranged along a 
continuum of degree of mental illness—a con- 
trol medical, an open psychiatric, and a locked 
psychiatric ward. 

A sociogram showed that the complexity of 
ward social organization decreased as degree 
of mental illness increased. Percentage of patients
with one or more reciprocal choices, percentage
of choices reciprocated, absolute number
of names known, and the absolute number
of choices made decreased as degree of mental
illness increased. The percentage of social 
isolates increased in the more disturbed wards. 

The results suggest that as mental illness 
increases, sociometric choices are influenced 
less by similarities in social background varia- 
bles such as age, race, and population area of 
origin. Except for paranoid schizophrenics, 
similarities in diagnostic variables have little 
effect on sociometric choices. 

There were fewer reciprocals, fewer
choices, etc., on negative sociometric items
than on positive ones for all wards. Negative
reciprocals, choices, etc., decreased as degree
of mental illness increased but proportionally
to the decrease in positive items.

Sociograms from three additional wards 
showed a degree of social interaction and or- 
ganization on a psychiatric ward emphasizing 
milieu and group therapy which was quite 
comparable to that of a control medical ward. 
In contrast, a psychiatric ward emphasizing 
somatic therapy showed much less interaction 
and organization. After a patient turnover, 
the study was replicated with similar results. 




It was concluded that as degree of mental 
illness increases, there is a decrease in social 
organization and social relationships involving 
positive or negative feelings. This process ap- 
pears to be reversed by milieu therapy. 


A COMPARATIVE STUDY OF INDIVIDUAL, MAJORITY, AND
GROUP JUDGMENT 


DEAN C. BARNLUND 


Northwestern University 


The comparative quality of decisions
made by groups and by individuals
working alone has been tested under a
wide variety of experimental conditions. In 
general, these studies indicate that group 
judgments are superior to individual judgments 
on certain types of intellectual problems (2, 6). 
Where experiments have employed groups 
composed of persons of different levels of abil- 
ity, however, it is not clear whether the quality 
of the decisions is due to the greater influence 
of the more capable members of the group or 
is a specific consequence of group thinking 
itself. Do groups make better decisions be- 
cause the less intelligent capitulate to the more 
intelligent members? Or are there psychological 
factors inherent in group interaction which 
produce the higher level of performance? When 
each group member possesses unique informa- 
tion or ideas it is not unreasonable to expect 
that interaction will increase the total amount 
of information and enlarge the perspective of 
the group as a whole. But what happens to the 
level of group judgments when interaction oc- 
curs among persons who are equally informed 
and talented? 

The present investigation is concerned with 
how decisions made by individuals working 
alone compare with the pooling of individual 
judgments through majority vote and with 
decisions reached through the process of group 
discussion when: (a) the membership of the 
group is homogeneous with respect to ability 
to solve the assigned problem; (b) the task is
complex, couched in prejudicial terms, and 
involves a range of possible solutions; and (c) 
individuals and groups are permitted the same 
length of time to complete their tasks. Finally, 
the study seeks to determine some of the fac- 
tors that account for any differences observed 
in individual and group performance. 


METHOD 
Subjects 


The Ss used in this experiment were students enrolled 
in freshman courses in group discussion over a three-year
period at Northwestern University. The members
of eight classes were used, 174 students in all. Of these, 
143 were assigned to experimental groups, and the 
remaining 31 served as control Ss. 


Procedure 


At the first meeting of the classes, Form A or Form B 
of the “Recognition of Valid Conclusions” test was 
administered. Form A and Form B were alternated 
as the first and final measures of problem-solving 
ability throughout the experiment to reduce any biasing 
effects growing out of differences in the two forms. 
Each student was given a copy of the test and an 
answer sheet and instructed to work out his solutions 
to the 30 problems individually. Ss were given the
50-minute period to complete the test items.

Each member of the class was then ranked according 
to the total number of items he answered correctly 
on the first form of the test. Eight or nine weeks later, 
before the end of the academic quarter, experimental 
groups were created. Four or five groups were formed 
in each of the classes used in the experiment. All stu- 
dents who received the same or similar scores on the 
first test were placed together so that homogeneous 
groups were created. Experimental groups were then 
given a single answer sheet and copies of the alternate 
form of the test and instructed to reach a group decision 
on each of the 30 problems. Experimental and control Ss 
were again given 50 minutes to finish the test. Members 
of the control classes repeated the test under the original 
conditions, solving items on the alternate forms of the 
test individually. A total of 29 experimental groups 
participated in the experiment. 
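
The grouping step described above can be sketched in a few lines. This is only an illustration of the idea of placing students with the same or similar first-test scores together; the function name, the use of contiguous blocks of the ranked list, and the sample scores are assumptions made for the example and are not taken from the study.

```python
def form_homogeneous_groups(scores, n_groups):
    """Rank students by first-test score and cut the ranking into
    contiguous blocks, so that each group holds neighbouring scores.

    scores   : dict mapping a student label to number of items correct
    n_groups : number of groups to form in the class (hypothetical input)
    """
    # Highest scorers first, as in a class ranking.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    # Spread any remainder over the first few groups so sizes differ by at most one.
    base, extra = divmod(len(ranked), n_groups)

    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)
        groups.append([name for name, _ in ranked[start:start + size]])
        start += size
    return groups


if __name__ == "__main__":
    # Invented scores for nine students on a 30-item test.
    first_test = {"a": 29, "b": 12, "c": 28, "d": 13, "e": 20,
                  "f": 21, "g": 19, "h": 27, "i": 11}
    for group in form_homogeneous_groups(first_test, 3):
        print(group)
```

Cutting the ranked list into blocks keeps the spread of scores within each group small, which is the property the procedure relies on.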

The final 10 group sessions were tape recorded in 
their entirety. An analysis was made of each of the 30 
decisions reached by the 10 groups to isolate the specific 
kinds of mistakes that contributed to the majority 
of group errors. Following all group tests, discussions 
were held with the experimental Ss concerning the 
factors they felt influenced their performance as mem- 
bers of the groups. 


Problem 


Many investigators of group phenomena have ad- 
mitted difficulty in finding or constructing suitable 
instruments for testing the efficiency and accuracy of
group decision-making. Problems, to be realistic,
should be complicated enough so they cannot be solved 
by intuition. They should be sufficiently difficult to 
test the limits of individual and group thinking. Social 
problems normally can be solved in a variety of ways, 
and test problems should contain this same feature. 
The difference between a right and wrong decision, 
however, should be clear and demonstrable. If possible, 
problems should be presented so as to involve the total 
personality of the individual and permit his prejudices
to influence his judgment as they do in a majority of 
everyday problem-solving experiences. 

The instrument used in this experiment, the Bradley 
test of Formal Validity in Problem Solving, seemed 
particularly well adapted for this purpose (1). The 
first section of the test entitled, “Recognition of Valid 
Conclusions,” proved long enough and sensitive enough 
to provide data on the experimental hypothesis. The 
30 problems which make up the test consist of partially 
constructed arguments of varying degrees of difficulty. 
Two statements are given which are to be assumed to 
be materially true. The problem is to select the conclu- 
sion that follows most logically from the premises. The 
arguments cover a wide range of subjects and are 
phrased deliberately to complicate the decision for the 
reader; that is, statements involve atheists, Com- 
munists, Republicans, college professors, and other 
terms likely to prejudice judgment. An example of one 
of the problems is given below: 

    Some Communists are advocates of heavy taxes;
    All advocates of heavy taxes are conservative Republicans;
    Therefore:
        Some advocates of heavy taxes are not Communists
        Some Communists are conservative Republicans
        Some conservative Republicans are Communists
        Some Communists are advocates of heavy taxes
        None of these conclusions follows

The validity and reliability of the instrument have been
established. The 30 problems on each of the two forms
include the 19 valid moods of the syllogism along with
the 11 most common fallacies. The test has been successful
in discriminating among college students with
different backgrounds in logic, mathematics, and
problem solving. Intercorrelation of the two forms
yields a raw score “r” of .85 (PE, .015) and a weighted
“r” of .88 (PE, .012). Items have been carefully scaled
in the final forms so that similar scores represent similar
patterns of individual errors.
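
For readers who want to check which of the listed conclusions are logically entailed by the two premises of the sample problem, a brute-force sketch is given below. It is not part of the Bradley test and says nothing about how the test is keyed; it only tests entailment under the modern reading of “all” and “some.” The term names and helper functions are invented for the illustration.

```python
from itertools import product

# The three terms of the sample problem (labels are illustrative).
TERMS = ("communist", "heavy_tax_advocate", "conservative_republican")

# A "kind" of individual is one truth value per term: 2**3 = 8 kinds.
KINDS = list(product([False, True], repeat=len(TERMS)))


def some(a, b, positive=True):
    """'Some a are b', or 'Some a are not b' when positive is False."""
    i, j = TERMS.index(a), TERMS.index(b)
    return lambda world: any(k[i] and (k[j] == positive) for k in world)


def all_are(a, b):
    """'All a are b' (no existential import is assumed)."""
    i, j = TERMS.index(a), TERMS.index(b)
    return lambda world: all(k[j] for k in world if k[i])


def worlds():
    """Every nonempty set of inhabited kinds; categorical statements
    depend only on which kinds of individual exist."""
    for mask in range(1, 2 ** len(KINDS)):
        yield [k for bit, k in enumerate(KINDS) if (mask >> bit) & 1]


def follows(premises, conclusion):
    """True if the conclusion holds in every world satisfying the premises."""
    return all(conclusion(w) for w in worlds() if all(p(w) for p in premises))


if __name__ == "__main__":
    C, A, R = TERMS
    premises = [some(C, A), all_are(A, R)]
    candidates = {
        "Some advocates of heavy taxes are not Communists": some(A, C, positive=False),
        "Some Communists are conservative Republicans": some(C, R),
        "Some conservative Republicans are Communists": some(R, C),
        "Some Communists are advocates of heavy taxes": some(C, A),
    }
    for text, conclusion in candidates.items():
        print(follows(premises, conclusion), "-", text)
```

Because the check considers every possible combination of inhabited kinds, a conclusion reported as following cannot have a counterexample under the stated reading of the quantifiers.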
RESULTS 

Measures of the relative effectiveness of 
individual, majority, and group judgments 
were obtained from scores made on the two 
forms of the “Recognition of Valid Conclu- 
sions’’ test. 

The number of items answered correctly on 
the first form was used to set up homogeneous 
groups and to determine the level of ability 
represented by the average scores and “superior”
scores of members of the experimental
groups when working alone.¹ The relative
accuracy of problem solving under conditions of
majority rule was derived from an item analysis
of the individual answers of each group
member. This “mathematical majority” indicated
how the groups would have scored if
they had pooled their opinions by secret
ballot. Of the total of 829 decisions made by
the experimental groups, 22 were found to be
deadlocks. These occurred whenever a group of
four or six Ss divided their votes equally between
right and wrong answers. The results of
splitting these decisions evenly and of crediting
all of them to the advantage of the majority
are recorded in Table 1 under “Deadlocks
divided” and “Deadlocks credited.” The
quality of group thinking was measured by
computing the mean scores of experimental
groups on the second form of the test when
they were required to reach consensus on each
of the test problems.

1 The “superior” member of a homogeneous group
is something of a misnomer. Experimental groups were
made up of Ss whose initial scores differed by no more
than a few points. In each case the “superior” member
refers simply to the individual who made the highest
individual score in the group despite its homogeneous
character.
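
The “mathematical majority” score and the two ways of handling deadlocks can be sketched as follows. This is a simplified illustration rather than the author's scoring procedure: each member's answer is reduced to right or wrong, and the function name, argument names, and sample votes are assumptions made for the example.

```python
from fractions import Fraction


def majority_score(votes_per_item, deadlock_rule="divided"):
    """Score a group under majority rule from its members' individual answers.

    votes_per_item : one list per test item; each entry is True if that
                     member answered the item correctly, False otherwise
    deadlock_rule  : "divided"  counts a tied item as half correct,
                     "credited" counts every tied item as correct
    """
    if deadlock_rule not in ("divided", "credited"):
        raise ValueError("deadlock_rule must be 'divided' or 'credited'")

    score = Fraction(0)
    for votes in votes_per_item:
        right = sum(votes)
        wrong = len(votes) - right
        if right > wrong:
            score += 1                      # majority chose the right answer
        elif right == wrong:                # deadlock (possible with 4 or 6 members)
            score += Fraction(1, 2) if deadlock_rule == "divided" else 1
        # right < wrong: majority is wrong, nothing is added
    return score


if __name__ == "__main__":
    # Invented votes of a four-member group on three items.
    votes = [
        [True, True, True, False],    # clear majority for the right answer
        [True, True, False, False],   # deadlock
        [True, False, False, False],  # majority wrong
    ]
    print(majority_score(votes, "divided"))
    print(majority_score(votes, "credited"))
```

Using exact fractions keeps the half-credit for each divided deadlock from accumulating rounding error when scores are later averaged.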

The mean scores obtained under the various 
experimental conditions and the t values they
yield are summarized in Table 1. The average 
scores of members of the experimental groups 
working alone are not significantly larger or 
smaller than the mean of majority scores when 
the 22 deadlocks are counted as correct in 
half of the instances and incorrect in the other 
half. When all deadlocks are resolved in favor 
of the correct decision, majority rule proves to 
be superior to the average performance of the 
individual group members. The “superior” 
members of the experimental groups, on the 
other hand, did significantly better than the 
majority when deadlocks were split, and as 
well as the majority when deadlocks were 
counted as correct solutions to the problems. 

Group decisions were found to be clearly 
superior to individual decisions. As a result of 
discussion, experimental groups obtained mean 
scores that were significantly higher, at the .01 
level, than “superior” members of the same
groups were able to attain through individual 
effort. These findings also hold true when re- 
sults for Form A and Form B are analyzed 
separately. Groups whose members scored 
initially near the upper limit of the test, 28 
or 29 correct answers out of a possible 30, 
gained least from solving problems coopera- 
tively. The largest gains were made by groups 
whose initial scores were low although nearly 
all of the experimental groups, with the excep- 
tion of the highest scoring group in each class, 

TABLE 1
COMPARISON OF INDIVIDUAL, MAJORITY, AND GROUP SCORES ON THE “RECOGNITION OF VALID CONCLUSIONS” TEST

Individual Decisions
    Means of average individual scores
    Means of “superior” individual scores
Majority Decisions
    “Deadlocks divided”
    “Deadlocks credited”
Group Decisions
    Mean scores of groups                      21.9

[Other means legible in the scan, not attributable to a row: 17.9, 18.3. The t values are not legible.]

* Significant at .05 level.
** Significant at .01 level.


made substantial gains as a result of group 
deliberation. Students in the lowest fifth of 
their classes as a group often rivalled the per- 
formance of the most brilliant member of the 
class working alone. In only two of the 29 
experimental groups did students working to- 
gether fail to outperform their own best 
member.²

When majority rule is compared with group 
consensus, the results show a similar large and 
significant advantage for group decision- 
making. Crediting all deadlocks from divided 
votes reduces the size of the group advantage 
over majority decisions, but its value is still 
highly significant. 

The 31 control Ss had mean scores on the 
initial administration of the “Recognition of 
Valid Conclusions” test of 18.5. (Control Ss 
made an initial mean score of 18.8 on Form A 
and of 18.2 on Form B.) On the final test form 
their mean score was 18.7. (The final mean 
scores for control Ss were 18.9 on Form A and 
18.6 on Form B.) This difference is not statis- 
tically significant and it is safe to assume that 
differences in mean scores of the experimental 
Ss were due to the experimental variables 
rather than differences in the test forms. 

These data indicate that the members of 
homogeneous groups can achieve significantly 
better decisions by solving their problems 
cooperatively than they can through voting or 
by individual effort. Majority decisions, when 
all deadlocks can be successfully resolved, can 
produce better results than are obtained from 
the averaging of individual efforts. But in 
three out of four of the conditions observed in 
this experiment, majority decisions proved to
be no better than, or inferior to, the decisions
of individual members of the same groups.

2 In both of these cases the groups contained individuals
who received almost perfect initial scores.


DISCUSSION 


The results of the first phase of this experi- 
ment need to be interpreted in the light of 
early research on collective judgments.
Whether they explained the finding on statis- 
tical or psychological grounds, Watson (10), 
Gordon (3), Stroop (8), and Gurnee (4) found 
grouped judgments superior to those of the 
average individual and equal to those of the 
superior individual working alone. This con- 
clusion is not supported by our data. When 
deadlocks are resolved on the basis of statistical 
probabilities, majority decisions are found to 
be no better than those of the average member 
of homogeneous groups. 

The explanation for the difference in results 
seems to lie partly in the character of the tasks 
and partly in the methods of grouping data. 
Some of the problems used by these investiga- 
tors involve what may be called additive 
activities. Whenever individual efforts are 
additive or cumulative, the larger the group 
the greater should be the advantage from 
combining individual data. Testing the accu- 
racy of conclusions drawn from given argu- 
ments is not the same kind of problem. One 
answer simply cannot be added to another. A 
second explanation for the difference is found 
in the manner of grouping individual decisions. 
The pooling of data in previous studies combined
the heterogeneous opinions of 10 to
100 individuals. In averaging data the greater
the number and range of scores, the larger the 
gain from cancelling out individual errors. In 
this case, only four to six opinions from indi- 
viduals of comparable ability determined the 
decision. Majority rule may prove a con-
venient political device for averaging individ- 
ual preferences; but our results suggest that 
in small, homogeneous groups or committees, 
majority rule, when it precludes discussion or 
debate, is likely to be less effective than the 
personal judgment of superior members of the 
group. 

After discussion, however, experimental
groups produced decisions that were far superior
to those of members working alone or
through majority rule. Moreover, group decisions
on the test problems were reached
within the same period of time allotted to individuals.

Several hypotheses are offered in the litera- 
ture for the high quality of group judgments. 
Watson found group decisions superior be- 
cause of the influence of the ablest member. 


In measuring the output of a group, either when work- 
ing along cooperative group-thinking lines or when the 
project permits the simple compilation of individual 
efforts, it matters little about the ability of the poorest 
or even average member of the group. The results seem 
to show primarily what the few ablest in the group have 
produced (10, pp. 333-334) 


This hypothesis, though generally tenable,
seems inadequate to explain the results of this 
experiment. Groups were made up of students 
whose initial performance indicated a common 
aptitude for selecting logical conclusions from 
given arguments. The grading of items on the
“Recognition of Valid Conclusions” test is
such that persons who get similar scores are 
likely to possess not only the same level of 
ability but similar habits of thinking. 

Another theory, suggested by Gurnee (4) 
and Thorndike (9), is that the better perform- 
ance of the group is due to the social influence 
of the more confident group members who are
more often right than wrong. It is difficult to 
see how this factor could have played a large 
part in the results. It would seem likely that 
students with similar patterns of right and 
wrong answers would share somewhat similar 
patterns of confidence about their answers. If 
so, this factor can be minimized. 

It is necessary to go beyond these hypothe- 
ses to explain how correct solutions were 
reached by groups whose members made simi- 
lar or identical errors when working alone. The 
diagnostic discussions and the analysis of recorded
group sessions furnish additional clues
to the psychological factors affecting the high 
level of group performance. 

Membership in the experimental groups pro- 
duced a higher level of interest in the successful 
completion of the task. Ss concentrated more 
intently on the assigned problems after being 
appointed to a group than they did when 
solving the problems individually. Group mem- 
bers found themselves more and more deeply 
involved as they proposed, and were forced to 
defend, their ideas. Participants identified 
with their own groups to such a degree that 
when some members became fatigued, others 
urged them to continue working. 

Membership in the experimental groups had 
an inhibiting as well as facilitating effect. 
Knowledge that one’s opinions were to be 
shared publicly made group members more 
cautious and deliberate in their own thinking. 
The necessity of explaining a conclusion forced 
many students to be more self-critical. Errors 
that might have been committed privately 
were checked before they were communicated 
to others. 

Groups had greater critical resources than did 
individuals working alone. In spite of the uni- 
form level of ability, group members saw dif- 
ferent issues and a larger number of issues 
than a single person did working alone. A 
greater number of viewpoints increased the 
group’s chances of selecting a valid one. Even 
the poorest members contributed significantly
to the quality of the group product. Remarks 
that went no deeper than “I don’t understand” 
or “That’s absurd” often saved the group from
error by forcing others to justify their opinions 
and in so doing disprove their own conclusions. 

A more objective view of the problem resulted
from competition between the private prejudices
of group members. The test arguments were
stated in loaded terms designed to make the 
choices between conclusions as difficult as 
possible. Each individual, however, brought a 
different set of values to his group. When argu- 
ments were stated so they appealed to persons 
of one persuasion, those in opposition were 
anxious to detect their error. In this way, 
liberals counteracted conservatives, Republi- 
cans offset Democrats, and “independents” 
guarded against critical lapses on the part of 
fraternity members. Groups were forced to 
become more objective, and this, of course,
increased their chances of drawing valid con- 
clusions. The significance of this one factor 
alone would be hard to overestimate. 

Discussion of the test items also prevented 
other incidental mistakes from occurring. 
Some groups had to check their instructions 
several times because members had different 
interpretations of them. Discussion often led 
to a clarification of terms used in the test, and, 
where logical fallacies spring from ambiguous 
terms, this may account for some of the gains. 
A number of groups formulated general prin- 
ciples as they went along to help them avoid 
repeating errors in later problems. 

What, then, prevented experimental groups 
from attaining even higher scores than they 
did? Analysis of the transcripts revealed two 
factors that together accounted for a majority 
of the group errors. The first was that group 
members agreed immediately and unanimously 
upon the wrong answer to a problem. Further 
study of the issue was then considered unneces- 
sary and wasteful. This is the same factor that 
Jenness, following F. H. Allport, refers to as 
the “impression of universality” (5). Agree- 
ment becomes the criterion of correctness. 
Maier (7) suggests that provoking arguments 
under these circumstances leads to better 
judgments. The virtue of disagreement and
the possible function of a “No-Man” in group
deliberations need further testing.

The second factor was that groups, when 
they reached a deadlock, were unable to use 
their differences of opinion for their own ad- 
vantage. When conflicts became intense they 
were resolved by surrender of the less aggres- 
sive members or by compromising on a third 
solution which was almost always incorrect 
but served to protect the egos of the parties to 
the controversy. Apparently disagreement 
stimulates thought up to a point; beyond that 
point, groups may lack the patience and skill 
to exploit it. 

Discussion, as a preliminary to group deci- 
sions, causes groups to examine a problem 
more thoroughly and to consider a wider 
number of solutions. It encourages individuals 
to think more carefully and in sharing opinions 
to expose the logic of their position to the 
inspection of others. Membership in a group
produces a sense of responsibility which intensifies
and sustains effort. The biasing effect
of private prejudice may be counteracted,
leading to a more objective view of the issues.
The data of this study indicate that the answer 
to the question of whether group opinion is 
better than individual opinion because of the 
influence of the superior person or because of 
the discussion process itself is that discussion 
inherently contains psychological pressures 
and motivations which, if not abused, tend to 
produce superior judgments on complex intel- 
lectual problems. Individual decisions and col- 
lective judgments lack the additional ingre- 
dient supplied by interaction which permits a 
group to outperform its own members. 


SUMMARY 


The performance of individuals working 
alone, under majority rule, and as members of 
discussion groups was compared on a complex
intellectual task. Individual judgment was 
measured by administering a test of ability to 
draw logical conclusions from given arguments. 
Individuals receiving similar scores were as- 
signed to the same experimental groups so that 
the factor of distributed ability would be re- 
duced to a minimum. The votes of members of 
the homogeneous groups were mathematically 
tallied to determine the results under condi- 
tions of majority rule. A second form of the 
test was completed as a group undertaking and
the scores compared with individual and ma- 
jority scores. The results indicated that: 

1. Majority decisions, when deadlocks are 
evenly divided between right and wrong an- 
swers, are not significantly different from those 
made by the average individual and are in- 
ferior to those of the best member of the group 
working alone. 

2. Group decisions, reached through coop- 
erative deliberation, are significantly superior 
to decisions made by individual members 
working alone and to majority rule. 

The superiority of group judgments was 
found not to be a reflection of the wisdom of 
the superior member of the group but a result 
of psychological factors inherent in discussion. 
Participation in a group led to more serious 
concentration on the task and to more enthu- 
siastic individual effort. Group discussion was 
found to stimulate more careful thinking, to 
lead to a consideration of a wider range of 
ideas, and to provoke more objective and
critical testing of conclusions. 


THE EFFECT OF ATTITUDE AND EXPERIENCE ON
JUDGMENTS OF CONTROVERSIAL STATEMENTS¹


MARSHALL H. SEGALL 


Columbia University 


Insofar as opinion formation and change
involve subjective judgments, several aspects
of adaptation-level² theory (4)
should prove to be of value for the develop-
ment of a psychology of opinion. Among the 
most promising conceptions incorporated in 
Helson’s unification of judgment principles are 
those of the implicit relativity of absolute 
judgment, a neutral reference point in terms 
of which incoming stimuli are evaluated, and 
the level of adaptation as a summary of pre- 
vious relevant stimulation. 

There is considerable agreement that a more 
solid basis is needed for a theory of attitudes, 
social norms, frames of reference, and the like, 
and since A-L theory has been offered as such 
(5), it seems well to assess the relevance of the 
theory to problems in this area. This study is 
focused on the effects of prior experience on 
extremity evaluation of opinion statements. 
According to the theory, absolute extremity 
judgments, like those required in the con- 
struction of a Thurstone questionnaire, are 
relative to each judge’s A-L. It follows that 
such so-called absolute judgments are in fact 
relative to the judge’s attitude, at least to the 
degree that his attitude reflects his experience 
with material similar to that being judged. In 
this regard, it is interesting to note that a rela- 
tionship between attitude and judgment of 
controversial material has recently been found 
by Hovland and Sherif (8). Thurstone’s assumption
of independence of attitude and
judgment in this setting (12), though seemingly
upheld by the work of several investigators
(e.g., 3, 7, 10), has now been seriously
challenged by the Hovland and Sherif demonstration
of displacement of judgment away
from the judge’s own position. Hovland and
Sherif note that this finding is virtually demanded
by the well-documented principles of
judgment which emerged over the years from
the psychophysics laboratories. These principles
are the foundation of A-L theory.

1 This report is based on a dissertation submitted
to the Graduate School, Northwestern University, June,
1957. The writer wishes to express his gratitude to
Donald T. Campbell for his guidance and encouragement
during all phases of this investigation, and to
Donald J. Lewis, Carl P. Duncan, William A. Hunt,
and Brenden Maher for their many contributions.
Rudolph Schulz, Donald Butler, and Norman Miller
also offered several helpful suggestions during the
course of the study.

2 Hereafter abbreviated A-L.

In the present study, an effort was made to 
replicate the Hovland and Sherif finding with 
similar material and, in addition, to test the 
utility of postulating an underlying A-L mech- 
anism. To this end, in addition to employing 
judges known to hold different positions on the 
topic of the items to be judged, the nature of 
the experience with the items during the judg- 
ment session was systematically manipulated. 
To demonstrate the strict relevance of A-L 
processes, it seems necessary first to demon- 
strate that experimentally manipulated expe- 
rience can lead to displacement of judgment. 

According to Helson’s theory, background 
and residuals interact to establish the A-L 
which serves as the reference point determining 
the response to each focal stimulus. As each 
stimulus is judged, it then becomes part of the 
background, thereby modifying the A-L as 
effective in subsequent judgments. As Helson’s 
theory is here interpreted, variations in judges’ 
attitudes represent variations in residuals, and 
manipulation of the order of occurrence of 
items to be judged is an empirical translation 
of variation in background.³

3 These identifications of residual and background
classes of stimulation with judge’s attitude and item
order seem consonant with Helson’s intentions. “Although
the distinction between stimulus as such and
background stimulus is not always easy to draw, when
an individual is instructed to respond to specific
stimulation during or soon after other stimuli have
acted, the latter may be regarded as background stimuli
with respect to the focal stimuli. Beliefs, attitudes,
traits, and cultural determinants which individuals
bring into any concrete situation may be regarded as
residuals in affecting specific responses” (6, p. 314).

The study was so designed that the relative
influences of residual and background factors
in each of two judgment tasks could be determined.
In one task, absolute judgments of the
extremity of statements about college fraterni- 
ties were required (this is essentially the 
Thurstone task); in the other, judges were 
required to state the extremity of statements in 
comparison with their own positions. The latter 
task was introduced in order to detect the 
presence of A-L phenomena when relative 
judgments are explicitly called for. 

In addition, the design of the study permits 
detection of a context effect. A-L theory pre- 
dicts that neutral items when judged in context 
with anti items will be judged as closer to the 
pro end of the attitude dimension than when 
judged in a pro items context. In the latter 
instance, judgments should be displaced to- 
ward the anti end of the dimension. 

Finally, this study provides data relevant to 
a prediction implied by A-L theory of a meas- 
urement bias inherent in the Thurstone ap- 
proach. If the range of positions spanned by 
the questionnaire items differs sufficiently from 
the respondent’s own position, so that his A-L 
is shifted, his responses to those items would be 
modified, thereby producing a change in atti- 
tude score. The second task employed here was 
arranged so that a change of this nature, were 
it to occur, could be detected. Whether this 
kind of change in score reflects a change in 
attitude is a moot question; a change in score 
due only to questionnaire form and content is 
of sufficient methodological concern to warrant 
study. 


METHOD 


A factorial design was employed, incorporating three
groups of judges known to differ in attitude toward
college fraternities, three orders of presentation of
statements about fraternities, two judgment tasks, and
an additional variation in manner of item presentation
involving the inclusion or noninclusion of “transition”
items.

Of the total 180 Ss, 60 were classified “pro-fraternity,”
60 “moderate,” and 60 “anti-fraternity.” In one
condition, 24 pro-fraternity statements
preceded 24 anti-fraternity statements; in a second
condition that order was reversed. In a third, or control,
condition, both kinds of statements were intermingled.
In all three conditions, 16 neutral items were interspersed
among the pro and anti statements. In addition,
for one half of the Ss a few additional items were included
after the thirty-second item to provide a smooth
transition phase between halves of the total item complement.
Finally, the task variable was introduced by
instructing one half of the Ss to make absolute judgments,
while the other half were told to indicate the
statements that best represented their own positions
and to judge the extremity of all other statements in
terms of those positions. These variations, to be discussed
in detail in the succeeding sections, resulted in a
36-cell experimental design.
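The cell count follows from crossing the four design variables. The short sketch below enumerates the cells; the level labels are illustrative names chosen here, not terms fixed by the study.

```python
from itertools import product

# Factor levels as described in the Method section (labels are illustrative).
attitudes = ["pro", "moderate", "anti"]
orders = ["pro-con", "con-pro", "control"]
tasks = ["absolute", "relative"]
transition = [True, False]  # transition items included or not

# Every combination of levels is one cell of the design.
cells = list(product(attitudes, orders, tasks, transition))

n_subjects = 180
per_cell = n_subjects // len(cells)

print(len(cells))  # number of cells in the design
print(per_cell)    # Ss per cell if the 180 Ss are spread evenly
```

With 180 Ss spread evenly, each cell holds five Ss, which agrees with the assignment procedure described under Subjects.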


Subjects 


Ss were drawn from a pool of 227 introductory psychology
students who had completed a 120-item multi-topic
attitude questionnaire several weeks before the
experiment proper began. The questionnaire was ad- 
ministered in class by the regular instructor and was in 
no way identified as part of this experiment. That the 
questionnaire was never mentioned by the Ss indicates 
that they did not link the two events. On the basis of 
their responses to seven imbedded items dealing with 
fraternities, 70, 89, and 68 respondents respectively were 
classified pro, moderate, and anti. One month after the 
questionnaire administration, daily circulation of 
volunteer appointment sheets was begun. This method 
of recruiting Ss is standard, so that volunteering for 
this study was, from the students’ point of view, not 
unusual. However, unknown to them, E possessed 
classification data, and as Ss volunteered they were 
assigned to the appropriate attitude group. 

Within each attitude group, Ss were assigned in 
rotation to one of 12 subgroups until each of the 36 
cells in the design matrix contained five Ss. Efforts 
were made to insure that all cells were filled at the same 
rate. Occasionally it was necessary to telephone po- 
tential Ss in a given attitude classification, but selection 
biases were minimized by obtaining in this manner an 
approximately equal number from each classification. 


Materials 


One hundred and fifty statements collected from 
various sources were categorized by ten psychology 
graduate students along a seven-category continuum 
ranging from extremely anti (Category 1) to extremely 
pro (Category 7). Sixteen neutral items (Category 4) 
and ten items each at the other six points were retained 
for use in the sorting tasks. (Henceforth, the category 
values assigned by the graduate students will be referred 
to as “true values.”) Each of the 76 statements was typed
on 3 x 5 cards to facilitate the variation in order of 
presentation. Each S was provided with a set of at least 
64 cards arranged in a predetermined order. Thus, for 
those Ss in the con-pro condition, the first 32 cards 
contained statements that were con or neutral (true 
values 1 through 4) and the last 32, pro or neutral 
(4 through 7). The reverse was true for the pro-con 
condition. In the remaining order condition, the whole 
range of values was represented in both the first and 
last 32 items.

A seven-compartment card receptacle with narrow 
slits along the top was employed to make it impossible 
for S to recover a card once sorted, or to see how many 
cards he had assigned to each compartment. The 
compartments were numbered from 1 to 7 and labeled 
appropriately for the particular sorting task being performed.
The numbers were fixed, but the labels indicating
the meaning of each compartment were posted
just prior to each S’s appearance at the laboratory.


Tasks 

Each S was instructed to perform one of the two 
card-sorting tasks. Random selection determined the 
pairing of Ss and tasks, and both tasks were run con- 
currently to avoid confounding the task variable with 
extra-experimental events. Extremity judgments were 
required in both tasks. In the absolute judgment task, 
Category 4 was defined as objectively neutral, while in 
the relative judgment task, that category was reserved 
for statements which S saw as expressing his own posi- 
tion. In the former task, the six other categories were 
labeled: [1] extremely anti-fraternity, [2] moderately
anti-fraternity, [3] slightly anti-fraternity, etc., through 
to [7] extremely pro-fraternity. In the latter task, those 
categories were labeled as follows: [1] very much too 
unfavorable toward fraternities to represent my feelings, 
etc., through to [7] very much too favorable . . . to repre- 
sent my feelings. It should be noted that the latter 
task is similar to that performed by respondents to an 
attitude questionnaire, with the additional require- 
ment that rejected items be arrayed along the ex- 
tremity continuum. 


Procedure 


Order of presentation. A basic packet of 64 cards in the 
pro-con order was arranged as follows: The first 32 
cards consisted of eight items each of true values 4, 5, 6, 
and 7, and the last 32 were made up of eight items each 
of values 1, 2, 3, and 4. Within each group of 32, items 
were arranged so that each true value was represented 
once in every successive four items. Within each set of 
four, the values represented occurred in any one of 24 
orders, the permutations of four values taken four at a 
time. These precautions were taken to insure that the 
only systematic true value variation was between sets 
of 32, with only random variation within. An additional 
control, for differential impact of individual statements, 
was employed by selecting arbitrarily for each S 48 
pro and con statements from the available 60. The ten
available items in each of the six pro and con categories 
were shuffled, and the eight topmost cards were used 
in the order determined by the shuffle. The two remaining
cards were omitted from the basic packet of the S
in question. Thus, all Ss in the pro-con condition 
received 32 cards of values 4 through 7 followed by 32 
cards of values 1 through 4, but each S in that group
actually received a unique order of statements. 
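The packet construction just described can be sketched in code. The sketch below is illustrative only: the card identifiers are hypothetical, the random generator stands in for the manual shuffle, and the 16 neutral cards are split into two different sets of eight (one of the two neutral-item variations used in the study).

```python
import random

VALUES_PRO_HALF = (4, 5, 6, 7)
VALUES_CON_HALF = (1, 2, 3, 4)

def build_half(values, pool, rng):
    """Return 32 (true_value, card) pairs: 8 sets of 4, one card per value in each set."""
    drawn = {}
    for value in values:
        cards = list(pool[value])
        rng.shuffle(cards)
        drawn[value] = cards[:8]  # keep the eight topmost cards
    half = []
    for block in range(8):
        order = list(values)
        rng.shuffle(order)  # one of the 24 permutations of four values
        half.extend((value, drawn[value][block]) for value in order)
    return half

def build_pro_con_packet(seed=None):
    rng = random.Random(seed)
    # Hypothetical card labels: ten cards per non-neutral category, 16 neutral cards.
    pool = {v: [f"value{v}-card{i}" for i in range(10)] for v in (1, 2, 3, 5, 6, 7)}
    neutral = [f"value4-card{i}" for i in range(16)]
    rng.shuffle(neutral)
    first = build_half(VALUES_PRO_HALF, {**pool, 4: neutral[:8]}, rng)
    second = build_half(VALUES_CON_HALF, {**pool, 4: neutral[8:]}, rng)
    return first + second

packet = build_pro_con_packet(seed=1)
print(len(packet))
```

Reversing the two halves would give a con-pro packet under the same controls.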

The general format prevailing in the con-pro 
condition was the direct reverse of that used in the 
pro-con condition, with the same controls in effect. 
A similarly unique order of items was achieved for 
Ss in the control condition, but here, all true values 
were represented throughout each packet. Thus, every 
successive arrangement of eight items included one 
statement each of values 1, 2, 3, 5, 6, and 7 and two 
statements of value 4. It should be clear that all Ss in 
each order condition ultimately were exposed to a single 
packet of at least 64 items drawn from the seven 
stimulus categories and that no two Ss received the 
same statements in identical orders.


TABLE 1
SUMMARY OF ANALYSIS OF VARIANCE OF MEAN
ABSOLUTE JUDGMENTS

Source                               df | MS

Judge’s attitude
Item order
Transition
Interaction: att. X order
Interaction: att. X trans.
Interaction: order X trans.
Interaction: att. X order X trans.
Replications


Transition. Ninety Ss, half of the total complement 
in each order condition, received six statements in
addition to the crucial 64 received by all Ss. These 
items, two each of values 3 and 5 and one each of values 
2 and 6, were inserted between the first and last 32 
essential items. These transition items were drawn from 
the cards omitted as a result of the shuffles described 
above, and were themselves shuffled to determine the 
six to be inserted. They were included for only half the 
Ss to permit an evaluation of the effectiveness of transi- 
tion items in masking the change in item values occur- 
ring between pro and con halves in the two experimental 
conditions. It might be expected that a transition 
phase, by minimizing S’s awareness of the value differ- 
ence across halves, would provide more favorable 
conditions for the operation of context-contrast effects. 
In a recent study of context effects on a judgment of 
musical tones (2), a number of Ss responded to a context
shift with an apparent assimilation error, an
effect that might be explained by S’s awareness of 
E’s purpose. 

Neutral items. In order to determine whether shifts 
in judgments of common stimuli could be effected by 
context changes, neutral items were included in both 
halves of all packets. In addition, for 90 Ss the neutral 
items in each half were identical, while for the other 
90 Ss a different set of eight occurred in each half. This 
additional minor variation was introduced in order 
to determine whether identical items would be as 
susceptible to context shifts as similar, but not identical, 
statements. 

Data collection. Following the steps outlined above, 
a packet of cards was prepared for each S just prior to 
his arrival. Ss were run individually. Instructions⁴ were
read aloud by E, then S was given the packet to sort.
Following his departure, results of his sort were re- 
corded, the container was prepared for the next S, and 
a new packet was assembled. This sequence required 
about 20 minutes per S. The data were collected during 
six weeks in the winter of 1956. 


4 Instructions are reproduced in their entirety elsewhere
(11). The statements employed are also available
there.

RESULTS


Results obtained for each task will be con- 
sidered separately. 


Absolute Judgment Task 


Mean category score. Hovland and Sherif (8) 
found that pro-Negro Ss, when judging items 
pertaining to Negroes, tended to place an over- 
abundance of items in the extremely anti cate- 
gory. A tendency on the part of anti-Negro Ss 
to concentrate items at the pro end of the 
continuum was also noted. This relation be- 
tween attitude and judgment may be viewed 
as a manifestation of response displacement 
away from A-L. In the present study the same 
mechanism would be expected. It would be pre- 
dicted that pro and anti judges would displace 
their judgments in opposite directions, with a 
higher mean response for the anti judges (due 
to relatively low residuals), a lower mean for 
pro judges (relatively high residuals), and an 
intermediate mean for moderate judges. In 
addition, as a result of the establishment of 
different backgrounds via the order manipula- 
tion, differences should obtain along the order 
dimension. Judges in the pro-con condition 
should displace their judgments in the anti 
direction, while judges in the con-pro condition
should do the opposite, reflecting A-Ls on the 
pro and anti side of neutral. To test these pre- 
dictions, a 3 X 3 X 2 analysis of mean cate- 
gory scores was performed. Both major vari- 
ables, judge’s attitude and item order, resulted 
in significant F ratios, when tested against the 
replication mean square. It should be noted in 
Table 1, however, that the level of significance
for the order effect is higher than that for the
effect of judge’s attitude, leading to a relative
lack of confidence in the reliability of the attitude
effect. Nonetheless, differences along
both major dimensions were in the predicted
direction.

TABLE 2
SUMMARY OF t TESTS, PRO-CON VS. CON-PRO

Mean responses to stimuli from each true
value category, absolute judgment

True value |   1   |   2   |   3   |   5   |   6   |   7
X̄ p-c      | 1.11  | 1.61  | 2.54  | 5.13  | 6.07  | 6.54
X̄ c-p      | 1.29  | 1.96  | 3.03  | 5.37  | 6.26  | 6.77
s X̄-X̄      | .066  | .117  | .084  | .070  | .101  | .126
t          | 2.73  | 2.99  | 5.83  | 3.43  | 1.88  | 1.83
p          | <.005 | <.005 | <.005 | <.005 | <.05  | <.05

df = 38.

Effects on stimuli of each true value. Figure 1 
shows the effect of item order on mean judg- 
ments of each class of items separately. Ss ex- 
posed first to pro statements judged anti 
statements as more anti than did Ss judging 
these statements at the outset. Conversely, 
pro statements were judged as more pro by Ss 
with prior exposure to anti statements than 
by Ss who judged pro statements first. Ss in 
the control group made intermediate judg- 
ments. Differences between mean judgments 
of every nonneutral class of items were shown 
to be significant by t tests summarized in
Table 2. (An analysis of responses to neutral 
items is discussed separately below.) 
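The t values in Table 2 can be checked from the tabled means, on the assumption that the third row of that table is the standard error of the difference between the two order-group means. The sketch below recomputes each ratio from values transcribed from the table; the function and variable names are illustrative.

```python
# true value -> (pro-con mean, con-pro mean, SE of difference, reported t),
# transcribed from Table 2.
table2 = {
    1: (1.11, 1.29, 0.066, 2.73),
    2: (1.61, 1.96, 0.117, 2.99),
    3: (2.54, 3.03, 0.084, 5.83),
    5: (5.13, 5.37, 0.070, 3.43),
    6: (6.07, 6.26, 0.101, 1.88),
    7: (6.54, 6.77, 0.126, 1.83),
}

def t_ratio(mean_a, mean_b, se_diff):
    """Difference between two means divided by the standard error of that difference."""
    return (mean_b - mean_a) / se_diff

for value, (m_pc, m_cp, se, reported) in table2.items():
    t = t_ratio(m_pc, m_cp, se)
    print(f"true value {value}: t = {t:.2f} (reported {reported})")
```

Under that assumption the recomputed ratios agree with the reported t values to within rounding, and every con-pro mean exceeds the corresponding pro-con mean.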

As shown in Fig. 2, pro, moderate, and anti 
judges made virtually identical judgments to 
stimuli throughout the range. No significant 
differences obtained for any stimulus class. 

Neutral items. A 3 X 3 analysis of variance 
of mean judgments of neutral items occurring 
in the first half of each packet revealed that 
the context in which those items were judged 
was, as predicted, a significant determinant. 
Judge’s attitude was not. This analysis is 
summarized in Table 3. However, the context 
prevailing in the second half of the judgment 
session did not affect judgment of neutral items 
as it did during the first half. The mean judgment
of the pro-con group was the same during
both halves; the mean judgment of the con-pro
group shifted in the pro direction in face of the
pro context, just contrary to the A-L prediction.
Thus, the intrasession shifts predicted
for neutral items judgment did not occur.

5 Tests based upon the components of variance
model (9, p. 332) lead to the conclusion that no significant
difference was produced by the attitude variable.

FIG. 2. MEAN ABSOLUTE JUDGMENT OF EACH
ITEM CLASS BY Ss IN EACH ATTITUDE
CLASSIFICATION


Relative Judgment Task 


Mean category score. Analysis of data col- 
lected in this task proceeded as above. Evi- 
dence of a general displacement of judgment 
was first sought. An analysis of variance of 
mean category score (see Table 4) showed 
that in this task, item order was not related to 
the Ss’ treatment of items. On the other hand, 
and again in contrast with the effects noted for 
the absolute judgment, the judges’ attitudes 
were to a large extent responsible for the category
assignments. However, this fact is less
than startling, for in this task each S was 
explicitly judging the items in relation to his 
own position. The relationship between atti- 
tude and judgment here indicates simply that 
instructions were followed. A more interesting 
finding is the absence of an order effect, par- 
ticularly since one was found for absolute 
judgment. When the mean responses to each 
class of stimuli were arranged separately for 
each order condition and for each attitude 
group, t tests indicated significant differences
only as a function of attitude. This relationship 
between judge’s attitude and relative judgment 
holds as well for neutral items. 

Neutral items. In this task, the data showing
the effect of order on neutral items present a
complex picture. Mean judgments during the
first half under both experimental conditions
were not significantly different. Thus, the predicted
context effects did not occur. During the
second half, under con and pro contexts, respectively,
mean neutral item judgments were
3.35 and 3.58, yielding a difference that approaches
significance, but this difference is in
the “wrong” direction. This latter fact suggests
a context-assimilation effect during the second
phase of judgment, but interpretation of these
data is hindered by the fact that the control
group means in both phases were higher than
those of either experimental group. The relationship
between context and neutral item
judgment in this task remains undefined.

TABLE 3
SUMMARY OF ANALYSIS OF VARIANCE OF NEUTRAL
ITEM ABSOLUTE JUDGMENTS DURING PHASE 1

Source                        | df | MS  |  F   |  p
Judge’s attitude              |  2 | .07 |      |
Item order                    |  2 | .46 | 5.05 | <.01
Interaction: attitude X order |  4 | .03 |      |
Replications                  | 81 | .09 |      |

TABLE 4
SUMMARY OF ANALYSIS OF VARIANCE OF MEAN
RELATIVE JUDGMENTS

Source                             | df |  MS  |   F   |   p
Judge’s attitude                   |  2 | 3.40 | 37.44 | <.001
Item order                         |  2 |  .02 |       |
Transition                         |  1 |  .02 |       |
Interaction: att. X order          |  4 |  .16 |  1.79 |
Interaction: att. X trans.         |  2 |  .15 |  1.65 |
Interaction: order X trans.        |  2 |  .19 |  2.09 |
Interaction: att. X order X trans. |  4 |  .28 |  3.06 | <.05
Replications                       | 72 |  .09 |       |

TABLE 5
SUMMARY OF ANALYSIS OF VARIANCE
OF ATTITUDE SCORES

Source                    | df |  MS   |   F   |  p
Judge’s attitude          |  2 | 6.260 | 37.04 | <.01
Item order                |  2 | 0.345 |  2.04 | n.s.
Interaction: att. X order |  4 | 0.125 |       |
Replications              | 81 | 0.169 |       |

Effect of item order on measured attitude. It 
was suggested that item order could alter 
attitude questionnaire scores as a manifesta-
tion of a shifting A-L. This suggestion was
tested by an analysis of variance of attitude 
scores (mean value of items assigned to Category
4, “my own attitude”). The analysis,
summarized in Table 5, indicates no significant 
difference in scores as a function of item order. 
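The F ratios in Table 5 are each effect mean square divided by the replications mean square, as in the analyses above that were tested against the replication mean square. A minimal check, using the mean squares transcribed from Table 5 (the names in the code are illustrative):

```python
# Mean squares transcribed from Table 5.
effect_ms = {
    "Judge's attitude": 6.260,
    "Item order": 0.345,
    "Interaction: att. X order": 0.125,
}
ms_replications = 0.169

def f_ratio(ms_effect, ms_error):
    """F ratio: effect mean square over error (replications) mean square."""
    return ms_effect / ms_error

f_values = {source: f_ratio(ms, ms_replications) for source, ms in effect_ms.items()}
for source, f in f_values.items():
    print(f"{source}: F = {f:.2f}")
```

The recomputed ratios for judge's attitude and item order match the tabled values of 37.04 and 2.04; the interaction ratio falls below 1, consistent with its omission from the F column.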

On the other hand, there were differences in 
score along the judge’s attitude dimension, a 
finding which indicates that the attitude classi- 
fication procedure applied in this study was 
reasonably valid. However, it is important to 
note the small range of average scores of the 
three attitude groups; the means for the pro, 
moderate, and anti Ss respectively were 4.89, 
4.52, and 3.89. 

DISCUSSION 
Absolute Judgment 

The principal finding of this study is that the 
prior presentation of attitude statements cov- 
ering only part of the total stimulus dimension 
biased the absolute judgments of all state- 
ments. As suggested in the introduction to this 
paper, an effect of this nature is understand- 
able in terms of the establishment of differen- 
tial A-Ls resulting from different backgrounds. 
In addition, A-L theory was interpreted to 
predict that different residual factors, as re- 
flected in the judge’s attitude, would also bias 
the absolute judgment, but the data here 
reported indicate that attitude has little effect. 
It would be tempting, then, to conclude that 
residual factors play only a minor role in ab- 
solute judgment, except that the judgment 
literature contains several demonstrations to 
the contrary. In addition to the Hovland and 
Sherif paper (8), Tresselt’s study (13) of 
weight judgments by athletic weightlifters and 
professional watchmakers is interpretable as 
evidence that past experience with weights 
outside the laboratory influences judgment in 
the laboratory. To account for the discrepancy 
between the present findings and those re- 
ported by other investigators, it may be well 
to consider the possibility that the residual 
factor was not adequately varied in the present 
case. 

For example, the judges in the Hovland and 
Sherif study included conservative Southerners 
and liberal Northerners; in the present study, 
judges were all students at a university that
supports the fraternity system. The range of
relevant experiences among the latter was, in 
all likelihood, far less broad than that among 
the Hovland and Sherif judges. Indeed, it was 
shown above that the range of attitudes of the 
judges used here was quite small, and the 
failure to find judgment differences related to 
this variable is reasonably attributable to that 
narrow range. 

Turning now to the neutral items absolute 
judgment, it should be stressed that the finding 
of a context-contrast effect during the first 
half of the session is in accord both with A-L 
theory and the scores of psychophysical studies 
that it synthesizes. Less easily incorporated 
into this framework, however, is the fact that 
similar context effects did not occur during the 
second half of the session. At least three inter- 
pretations of the data seem reasonable. (a) 
Once having assigned eight neutral items to 
categories during Phase 1, Ss assigned similar 
(and in some cases identical) neutral items to 
the same categories during Phase 2. This hy- 
pothesis is labeled commitment. (b) S reacted 
to the contrasting context operative during 
Phase 2 by assimilating his judgments to that 
context. This is a context-assimilation hypothe- 
sis. (c) The experience during Phase 1 alone 
determined the A-L, so that judgments of 
neutral items during both phases were relative 
to that A-L and unaffected by the stimuli oc- 
curring in Phase 2. This is a primacy hypothe- 
sis. 

Some evidence for the commitment hypothe- 
sis may be drawn from the fact that those Ss 
who judged two identical sets of neutral items 
did indeed provide almost identical judgments 
during both phases. On the other hand, the 
data for Ss judging different sets of neutral 
items are more consonant with an assimilation 
hypothesis. These Ss in both experimental 
groups differed as predicted in Phase 1, but 
then their judgments diverged rather than 
converged in Phase 2 when the contexts were 
reversed. Examination of individual protocols 
revealed that about half of these Ss made 
judgments that could be described as assimila- 
tive. In two studies by Campbell et al. (1, 2), 
in which the prevailing phenomenon was a 
contrast effect in response to a shift in context, 
a significant number of judges were assimi- 
lators. It will be recalled that an attempt was 
made here to seek an explanation for assimila-
tion in terms of S’s awareness of E’s intent in
shifting the context. By partially masking that 
shift for some Ss it was felt that assimilation 
would be partially eliminated. If it did occur, 
it would be expected more frequently among 
Ss for whom the shift was abrupt and probably 
obvious. This was not the case. Assimilation 
could not be linked with any known stimulus or 
subject variable, thus leaving the matter of 
assimilation unexplained. 

The primacy hypothesis admits that the 
data may be described in terms of assimilation, 
but suggests that they be “explained” in terms 
of contrast. This hypothesis requires that A-L 
theory be modified to allow temporal weighting 
of stimuli, with those occurring earlier in a 
contiguous series contributing more heavily 
to the A-L. A within-session primacy hypothe- 
sis coupled with a between-sessions recency 
notion would account for all the facts noted 
in the absolute judgment part of this study. 
These paired notions first lead to the prediction 
that an A-L built up over a long series of 
experiences in the relatively distant past would 
be easily modified at the outset of a new and 
distinguishable judgment experience. Aside 
from the support provided by the present 
demonstration that judge’s attitude played a 
minor role in influencing judgment, the fact 
that weightlifters and watchmakers in Tres- 
selt’s study converged upon a common judg- 
ment scale provides additional evidence. Fi- 
nally, the assumption of a within-session 
primacy effect accounts for the apparent as- 
similation found here in Phase 2, but since the 
hypothesis was designed to account for these 
data, it will require future independent con- 
firmation. 
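The temporal weighting proposed here can be illustrated numerically. The sketch below is not Helson's formulation, which is not reproduced in this paper; it assumes, purely for illustration, that the A-L is a weighted arithmetic mean of the true values already judged, and that a primacy weighting can be represented by a geometric decay (the `decay` parameter is hypothetical).

```python
def adaptation_level(values, decay=1.0):
    """Weighted mean of the stimulus values judged so far.

    decay = 1.0 weights every stimulus equally; decay < 1.0 gives each
    later stimulus a smaller weight (an assumed primacy weighting).
    """
    weights = [decay ** i for i in range(len(values))]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Simplified pro-con series: eight sets of values 4-7, then eight sets of values 1-4.
pro_con = [4, 5, 6, 7] * 8 + [1, 2, 3, 4] * 8

equal = adaptation_level(pro_con)                # equal weights
primacy = adaptation_level(pro_con, decay=0.9)   # earlier stimuli weigh more

print(equal, primacy)
```

With equal weights the A-L returns to the neutral value of 4 by the end of the session, so the second block fully offsets the first. With the assumed primacy weighting the A-L remains on the pro side of neutral throughout Phase 2, which is the pattern the primacy hypothesis requires.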

If two studies by Campbell et al. (1, 2) are 
jointly considered, there is further reason for 
correcting that part of A-L theory which 
posits an equal contribution from all stimuli. 
In one study (2), judgments of musical pitch 
revealed a recency effect; in the other (1), in 
which judgments of degree of disturbance indi- 
cated by schizophrenics’ definitions were stud- 
ied, a primacy effect was found. This apparent 
contradiction would lead to the hypothesis that 
the relative likelihood of finding primacy or 
recency effects depends upon the nature of the 
material being judged. Judgment of verbal
materials, which presumably involves central
processes more than does judgment of musical 
pitch, may be subject to primacy effects. At
any rate, an assumption of equal weights ap- 
pears to be an oversimplification. 


Relative Judgment 


The data emerging from this part of the 
study require little amplification. As a result of 
instructions to judge the statements in relation 
to one’s own attitude, judgments were free 
from the context influences found for absolute 
judgment. In a real sense, the reference point 
provided by “my own attitude” replaced the 
A-L. Helson (5) has also reported that when 
judgments are in terms of a comparison be- 
tween variables and a standard, the A-L is 
largely determined by the standard. 


SUMMARY AND CONCLUSIONS 


An interpretation of adaptation-level theory 
suggests that judgment of controversial state- 
ments is determined by the judge’s relevant 
stimulus history. Among the implications of 
this suggestion are (a) a dependence of abso- 
lute extremity judgments upon the attitude of 
the judge, and (b) a potential distortion of
attitude scores due to the item order on a 
questionnaire. 

The object of this study was to determine 
some of the relationships among attitude, ex- 
perience, and judgment of attitude statements. 
Absolute judgments of statements about fra- 
ternities were made by 90 college students. 
Another 90 were instructed to judge the state- 
ments in relation to their own attitudes toward 
fraternities. Judges were classified in terms of 
their attitudes, measured about one month 
prior to the experiment proper, and were as- 
signed so that all classifications were equally 
represented in both judgment tasks. Experi- 
ence within each task was varied by presenting 
two blocks of items, each covering half the 
stimulus range, in one or the other order. A 
factorial design incorporating judge’s attitude 
and item order was employed. 

In the absolute judgment task, the order 
variation resulted in significant contrast effects, 
while judge’s attitude had little influence. In 
the relative judgment task, responses were re- 
lated to attitude, but not to item order. Atti- 
tude scores were not affected by variation in 
item order. In general, findings were consistent
with adaptation-level theory, provided that 
experiences are weighted differentially ac- 
cording to (a) their degree of remoteness in 
time, and (b) the nature of the judgment task.


REFERENCES 

1. Campbell, D. T., Hunt, W. A., & Lewis, Nan A.
The effects of assimilation and contrast in
judgments of clinical materials. Amer. J. Psychol.,
1957, 70, 347-360.

2. Campbell, D. T., Lewis, Nan A., & Hunt, W. A.
Context effects found with a judgmental language
that is absolute, extensive, and extra-experimentally
anchored. Unpublished manuscript,
Northwestern Univer., 1957.

3. Ferguson, L. W. The influence of individual
attitudes on construction of an attitude scale.
J. soc. Psychol., 1935, 6, 115-117.

4. Helson, H. Adaptation-level as frame of reference
for prediction of psycho-physical data. Amer. J.
Psychol., 1947, 60, 1-29.

5. Helson, H. Adaptation-level as a basis for a
quantitative theory of frames of reference.
Psychol. Rev., 1948, 55, 297-313.

6. Helson, H., Blake, R. R., Mouton, Jane S., &
Olmstead, J. A. Attitudes as adjustments to
stimulus, background, and residual factors. J.
abnorm. soc. Psychol., 1956, 52, 314-322.

7. Hinckley, E. D. The influence of individual opinion
on construction of an attitude scale. J. soc.
Psychol., 1932, 3, 283-296.

8. Hovland, C. I., & Sherif, M. Judgmental phenomena
and scales of attitude measurement:
Item displacement in Thurstone scales. J.
abnorm. soc. Psychol., 1952, 47, 822-832.

9. McNemar, Q. Psychological statistics. (2nd ed.)
New York: Wiley, 1955. P. 332.

10. Pintner, R., & Forlano, G. The influence of
attitude on the scaling of attitude items. J. soc.
Psychol., 1937, 8, 39-45.

11. Segall, M. H. Attitude and adaptation-level:
The effect of experience on extremity judgments
and on expression of opinion. Unpublished
doctoral dissertation, Northwestern Univer.,
1957.

12. Thurstone, L. L. The measurement of social
attitudes. J. abnorm. soc. Psychol., 1931, 26,
249-269.

13. Tresselt, M. E. The effect of the experience of
contrasted groups upon the formation of a new
scale of judgment. J. soc. Psychol., 1948, 27,
209-216.


Received September 12, 1957. 







INFLUENCE OF FOUR TYPES OF DATA ON DIAGNOSTIC 
CONCEPTUALIZATION IN PSYCHOLOGICAL TESTING 


WILLIAM F. SOSKIN¹, ²


Joint Commission on Mental Illness and Health, Cambridge, Massachusetts 


RECENT reviews of the literature on
diagnostic testing reflect a shift in the
past decade from the search for more
subtle and more “revealing” instruments to
more intensive study of the behavior of appraisors
in the diagnostic situation.

The present research is one of a series inves- 
tigating the influence of different types of data 
on the conceptualizing process in diagnostic 
situations. Its objective is to study the differ- 
ential influence of four types of data on the 
judgments of individuals as reflected in their 
responses on a criterion test consisting of a 
series of multiple-choice items pertaining to the 
known life experiences, attitudes, etc. of a 
subject called David. The sets of data utilized 
were selected in part to study the contribu- 
tion of certain types of information to clinical 
diagnosis and appraisal. 


METHOD 


Criterion Test 


The task posed for the several groups of judges was 
that of providing correct answers to three types of 
questions, some dealing with David’s current interests 
and attitudes, others dealing with certain of his “char- 
acteristic” behavior patterns, and still others dealing 
with his responses to specific recent life situations. 
These questions were presented in the form of four-alternative
multiple-choice items, each set of alternatives
being preceded by a statement providing pertinent
facts for, and setting the context of, the question.
Following is an example: 

On one occasion two foreign students who had completed
their studies and were leaving the school
arranged a small party for their friends. Some mild
spirits were available to those interested, although
as nearly as can be ascertained nothing stronger
than wine was served. In this school, opinion is divided
on the matter of social drinking. Some students
object strongly, others are more tolerant of
moderate social drinking. Members of both groups
were present at the party.

a. At first, David was a little uncomfortable, but
he joined in, swapped stories with the rest, and
drank.

b. After a couple of drinks he began a rather unconvincing
act of being slightly intoxicated, began
to slur his speech, walk uncertainly, etc.

c. He conspicuously refused a proffered drink, and
shortly after reprimanded the host for serving liquor
to such a gathering.

d. He remained throughout the party, but made
careful note of the persons who accepted drinks and
those who told questionable jokes and stories, and
subsequently dropped them from his list of social
acquaintances.

¹ For invaluable assistance in carrying out various
phases of this study the writer is deeply indebted to
Sarah Counts, D. W. Fiske, Irving Leiden, Joanne
Powers, Rae Shifrin, Saul Siegel, and Alvin Winder.

² This study was carried out while the author was
at the University of Chicago.

The material for these items came principally from 
three sources: interviews with David, interviews with 
his wife and other associates, and direct observation 
by members of the research staff. David participated 
in a total of approximately nine hours of personal 
interviews with the investigator. During these sessions, 
which were conducted as relatively unstructured 
history-taking interviews, he was encouraged to talk 
freely about his marriage, his childhood, his present 
work and aspirations for the future, etc. Aside from 
occasional questions setting a new topic or seeking 
further elaborations, there was minimal intrusion by 
the interviewer. These sessions and all other interviews 
in this study were recorded on wire for subsequent 
reference. 

A second, though no less important, source of in- 
formation was David's associates—his wife and a num- 
ber of close acquaintances, both male and female. 
Upon request, David provided the names of a number 
of acquaintances with whom he was in frequent con- 
tact, persons whom he had no objections to our inter- 
viewing. From this list, seven or eight persons were 
interviewed, some only once or twice, several others 
three or four times over a two-month period. Each of 
the interviewees was given a careful explanation of 
the nature of the project and was apprised of the fact 
that David had given his permission for such inter- 
views to take place. (David himself did not know 
which persons were to be called for interviews.) These 
interviews focused largely on current events in David’s 
life.

David's wife was interviewed twice, for a total of 
approximately three hours. Her interviews were 
moderately structured, since the main objective here 
was to gather indirectly information that might corroborate
or qualify statements made by her husband.

Direct observation yielded relatively little useful 
information because the periods of observation were 
few and brief and because there was no way of obtain- 
ing corroboration of the single observer’s reports. Yet, 
it was possible by this method to obtain clues to other 
appropriate item material. 

From this collection of observational notes and
recordings, an initial collection of 54 items was constructed.
The present analysis, however, deals with
only 37. Eight items were dropped because they differed
slightly in two versions of the test, and nine
others were omitted because all persons in each of the
four groups taking the test failed to choose the correct
alternative. Consequently, they were thought either
too difficult or patently misleading in structure. In
assembling the test, some attempt was made to include
items pertaining to a broad range of behaviors; however,
there was no conscious or planned attempt to
include equal numbers of items indicative of good
adjustment and poor adjustment, an outcome which,
though subsequently important, must be regarded as
somewhat fortuitous.


Critical Data 


The designation “critical data” refers to the differ- 
ent types of specialized information provided to the 
several judging groups. Four different classes of data
were considered: a collection of biographical facts,
most of them provided in the context of the criterion
test items; observations of the subject in a series of
role-playing situations; the subject’s Rorschach protocol;
and a battery of psychological tests.

Role-playing situations. Two weeks prior to the day
on which David participated in the role-playing sessions,
a group of graduate students participating in the
study (five of whom were later to serve as judges in
the role-playing session) examined the criterion test
and designed nine situations which they considered as
likely to elicit information pertinent to the questions
raised in the criterion test. None of the situations was
“tailored” to particular items; rather, each was designed
to provide information from which a variety of inferences
might be drawn. Two of the situations were
small-group discussions (e.g., David as a new, young
superintendent of a boys’ correctional school discussed
school policy with veteran staff members). The remaining
seven situations involved only the subject and
a standard role-player. In four of these seven situations
the standard role-player was a male, in three, a female.
In the four all-male situations, David’s relationship to
the other person was varied so that once he was a subordinate
(a news reporter being pressed by the editor
to meet a deadline), once in an ambiguous position
(the young and inexperienced institutional superintendent
talking to a long-time staff member with
strong political connections), and twice a peer (a potential
client discussing with an architect plans for
buying property and building a house, and discussing
philosophies of life with a carefree artist whom he met
at a cocktail party). In the male-female situations he
was once a peer (to the wife of a friend who seeks advice
on a marital problem), once a superior (to a young
teacher whose behavior both in school and out has
caused “some talk” in the small community where he
is superintendent of schools), and once in a quasi-dependent
position (that of a convalescing patient
with an oversolicitous, mildly dominating nurse).

In all, David interacted with five different people, 
three women and two men. Two of the women were 
five to seven years older than David, who was about 
26 years of age, and one was about three years younger.
One of the men was about six years older than David,
the other about his own age. These standard role-players
were picked for their ability to adapt easily to
these particular roles. All of them were instructed as
to the general character of the situation they were to
try to create, but were left free to implement these
directions in their own fashion as the situation developed.
The role-players were given two opportunities
oped. The role-players were given two opportunities 
to run through their roles prior to the day on which 
the sessions were held. 

Psychological tests. Prior to the beginning of the 
history-taking interviews, a day or two after David 
had been given an account of the nature of the study 
and had agreed to participate, he was given a battery 
of psychological tests, including the Wechsler-Bellevue, 
Rorschach, TAT, a sentence completion test, a picture- 
drawing test (man, woman, building), the Allport-Ver- 
non Scale of Values, the Guilford-Martin inventories 
(GAMIN and STDCR), and a 60-item word-association
task. The Rorschach and TAT responses as
well as the inquiry on the former were wire-recorded 
and transcribed. Judges using the Rorschach received 
hectographed copies of the original responses and in- 
quiry, together with an accurately marked location 
chart and a brief description of the subject’s behavior 
while taking the test. Judges using the entire battery 
received these same materials plus duplicate copies of 
the other test records and photostatic copies of the 
picture-drawings.

Biographical data. All subjects were told David's 
age, race, religion, marital and educational status, and 
present occupation. In addition, wherever relevant, 
biographical material was contained in the stems of 
the criterion test items. Hence, all judges knew certain 
salient facts about his military history, his career objectives,
his extracurricular interests and accomplishments
in college, his financial problems, etc.


Judging Groups 


Role-play judges. The persons who observed David 
in role-playing situations (RP judges) included five 
advanced graduate students in clinical psychology who 
had been participating in a seminar on the analysis of 
role-playing behavior, and four psychiatrists, none of 
whom had any previous experience with role-playing 
situations in this type of setting.

Rorschach judges. Seven persons (Ror. judges), five
of whom were generally regarded as experienced
Rorschach diagnosticians and who function professionally
in that capacity, evaluated David’s Rorschach.
Several of them were “authorities” in the sense that,
in addition to years of clinical experience with the
instrument, they had published on and taught advanced
courses on Rorschach testing. The two less
experienced persons were advanced psychological
interns, functioning in the setting of a mental institution.

Test battery judges. The group concerned with the 
test battery (TB judges) consisted of 39 graduate
students at various stages of training in clinical psy- 
chology. All were second- or third-year interns in the 
Veterans Administration clinical psychology training 
program. 

Control groups. Two control groups were utilized. 
The first (“Guess” judges) was a group of 12 psychologists
enrolled in an advanced Rorschach workshop.
These judges studied the criterion test and made their
item selections solely on the basis of the related facts 
reported in the test items. 

A second group consisted of 21 student nurses. For 
this group a special version of the test was prepared in 
which all items were identical to the standard criterion 
test except that a different man’s name was used in each 
item. The nurses were told that the items constituted 
a test of their knowledge of male behavior patterns 
and were based on the actual behaviors of a group of 
white, married (no children) theology students be- 
tween the ages of 25-27, the group from which David 
was drawn. By this means it was possible to neutralize 
the cumulative biographical information about David 
otherwise available in the criterion test. The nurses 
were asked to study each item and to try to guess 
which behavior might be most probable for a man with 
the characteristics indicated. 


Procedure 


At intervals varying from two weeks to three days 
before receiving the critical data, Role-play and 
Rorschach judges were given a copy of the criterion 
test and were asked to answer the items by using what- 
ever cues they could obtain from the biographical data 
available in them. The objectives of this assignment 
were twofold: (a) to acquaint judges with the exact 
types of questions they would be required to answer 
subsequently on the basis of the critical data, and (b)
to obtain information about the prevailing stereotype 
held by judges for such a person as David. Scores 
earned at this stage, before exposure to the critical 
data, hereafter are referred to as stereotype scores. All 
Ror. judges and five of the RP judges completed the 
stereotype assignment only two or three days before 
the critical information was provided. The five RP 
judges who participated in designing the role-playing 
situations completed the stereotype tasks about two 
weeks before the day on which they saw David perform 
in these situations. 

The TB group did not do the stereotype task; only 
one day of this group’s time was available for the study, 
and the entire day was required for their studying the 
test battery and making the final set of responses. The 
Guess group, on the other hand, was asked to do only 
the stereotype task. 

In point of time, David participated in the role-playing
situations about four months after first consenting
to participate in the project. On the assigned
day, he was brought to the laboratory and was given
an explanation as to the nature of the role-playing
situations as well as a brief and very general description
of the task confronting the judges. It was explained
to him that all the judges and all the standard
role-players except one—a person he had met briefly 
through the investigator—were complete strangers to 
him, persons it was unlikely that he would encounter 
outside of this research situation. 


RESULTS


Differences in Over-all Accuracy


The mean and range of scores for each of the
groups of judges are reported in Table 1. Significance
of differences in performance between
the groups was tested by comparing means of
the sum of ranks (1) for the several groups
participating in the stereotype stage and in the
final stage. In the former instance H = 4.099,
with df = 2, .10 < p < .25. In the latter
instance H = 2.739, with df = 2, .25 < p <
.50. Significance of differences between stereotype
and final scores for RP judges and for
Ror. judges was tested by Wilcoxon’s T
method for ranks (1). In neither case were the
differences significant. It is possible to con- 
clude, therefore, that none of the three types of 
critical data improved accuracy scores (with 
respect to the specific type of criterion em- 
ployed in this study) beyond the level achieved 
by study of biographical facts alone. 
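The probability brackets attached to an H statistic with df = 2 can be checked by hand, because the chi-square upper-tail probability with two degrees of freedom reduces to exp(-H/2). The following is a minimal sketch, assuming that H is referred to the chi-square distribution as is usual for rank-sum statistics of this kind; the function name is illustrative and is not part of the original analysis.

```python
import math


def chi_square_tail_df2(h):
    """Upper-tail probability of a chi-square variate with 2 degrees of freedom."""
    return math.exp(-h / 2.0)


# H values reported in the text for the stereotype and final stages.
p_stereotype = chi_square_tail_df2(4.099)
p_final = chi_square_tail_df2(2.739)

print(round(p_stereotype, 3), round(p_final, 3))
```

Under that assumption, neither probability approaches the conventional .05 level, which agrees with the conclusion that the groups did not differ reliably in over-all accuracy.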

Because of the similarity in mean scores for 
all six judging tasks, general characteristics 
of the criterion test were examined to deter- 
mine whether difficulty of the test imposed a 
uniformly low ceiling on judge performance. 
It was found that 14 of the 37 items were 
correctly identified by 75% or more of the 
judges in at least one of the six subgroups of 
judges; 21 of the items were correctly identi- 
fied by 67% or more of the judges in at least 
one of the groups, and 29 items were identi- 
fied correctly by 50% or more of the judges 
in at least one group. In the remaining eight 
items, the correct alternative was chosen by 
at least 40% of the judges in one or more 
groups. A total of 25 items met the double 
criterion of having been correctly identified 
by at least 50% of the judges in at least two of 
the six subgroups, or by at least 67% of the 
judges in any one judging group. It was evident
from this inspection that the similarity
of means was not a resultant of the criterion
test’s containing a pool of easy items which
most judges answered correctly under all
circumstances and another pool of difficult
items which most judges failed.


TABLE 1

MEAN AND RANGE OF SCORES ON CRITERION
TEST BY FIVE DIFFERENT JUDGING GROUPS

                 Stereotype          Final
Judges          Mean  | Range  |  Mean  | Range
Role-play       14.56 | 10-19  |  14.78 | 9-23
Rorschach       15.00 | 11-20  |  13.71 | 11-16
Test battery     --   |  --    |  15.49 | 9-22
Guess           12.00 | 6-19   |   --   |  --
Nurses          10.81 | [illegible] | -- |  --
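The double criterion applied to the items above can be stated compactly. The sketch below is illustrative only: the function name and the example proportions are hypothetical, and the thresholds (.50 in at least two subgroups, or .67 in any one) are taken from the text.

```python
def meets_double_criterion(proportions):
    """Return True if an item satisfies the double criterion.

    proportions: fraction of judges choosing the correct alternative,
    one value per judging subgroup (six subgroups in the text).
    """
    subgroups_at_half = sum(1 for p in proportions if p >= 0.50)
    any_at_two_thirds = any(p >= 0.67 for p in proportions)
    return subgroups_at_half >= 2 or any_at_two_thirds


# Hypothetical item: proportion correct in each of six subgroups.
example_item = [0.55, 0.30, 0.52, 0.10, 0.25, 0.40]
print(meets_double_criterion(example_item))
```

An item passes either because two subgroups reached the one-half mark or because a single subgroup reached two-thirds; an item on which only one subgroup barely exceeded one-half does not.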


Differential Accuracy 

Performance of the judging groups was 
analyzed in two different ways in order to 
ascertain whether there were significant differ- 
ences in accuracy on different parts of the test. 
These analyses had to do on the one hand with 
the types of tasks faced by the judges, and on 
the other hand with the degree of malad- 
justment implied in the item alternatives. 

Item types. Regarding the task approach,
the items making up the criterion test posed 
three presumably different types of questions 
for the judges. One group of 17 items required 
judges to postdict David’s behavior in specif- 
ically described situations; a second group of 
9 items called for the specification of general 
characteristics of his behavior, and the third, 
containing 11 items, pertained to certain of 
David’s interests and attitudes. Mean scores 
for RP and Ror. groups on these three types of 
items are reported in Table 2. 

Differences in performance on the three 
types of items at a given stage were tested by 
means of analysis of variance of ranks in a 
twofold classification, based on the proportion 
of successes in each category by each individ- 
ual. For RP judges at the stereotype stage, 
chi square was 10.0, with df = 2, p < .01. 
For Ror. judges at the stereotype stage, chi 
square was 6.7, with df = 2, .02 < p < .05. 
No significant differences were found at the 
final stage. 
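An analysis of variance of ranks in a twofold classification can be sketched as follows. The sketch assumes that the procedure corresponds to what is now usually called Friedman's test, which the text does not state explicitly; the data are hypothetical, no correction for ties is applied, and the function names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1 to k, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def rank_anova_chi_square(rows):
    """Chi-square statistic for n judges (rows) by k item types (columns)."""
    n = len(rows)
    k = len(rows[0])
    rank_totals = [0.0] * k
    for row in rows:
        for col, rank in enumerate(average_ranks(row)):
            rank_totals[col] += rank
    sum_sq = sum(total * total for total in rank_totals)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


def chi_square_tail_df2(x):
    """Upper-tail chi-square probability for 2 degrees of freedom."""
    return math.exp(-x / 2.0)


# Hypothetical proportions correct for nine judges on three item types,
# with every judge ordering the three types identically.
hypothetical_rows = [[0.2, 0.5, 0.8] for _ in range(9)]
statistic = rank_anova_chi_square(hypothetical_rows)
print(statistic, chi_square_tail_df2(statistic))
```

With three item types the statistic has df = 2, so the same closed-form tail probability serves to check the reported values of 10.0 and 6.7 against their stated probability brackets.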

Differences in accuracy within item types
at two different stages, i.e., between stereotype
and final stage, were tested by Wilcoxon’s T
method. RP judges showed no significant
change in accuracy on any of the three types
of items; Ror. judges, however, showed a significant
decrease (p = .05) in accuracy on
postdiction items from stereotype to final
stage.


TABLE 2

MEAN ACCURACY SCORES OF ROLE-PLAY AND
RORSCHACH JUDGES ON THREE CLASSES OF
ITEMS AT STEREOTYPE AND FINAL STAGES

[Table body illegible in the source. Surviving labels: rows by Item Type (Postdiction; Characteristics, 9 items), each at Stereotype and Final stages; columns of Mean Accuracy Scores for Role-play and Rorschach.]

Maladjustment index. The second analysis
was aimed at investigating differential accu- 
racy in relation to a particular set of hypoth- 
eses or attitudes which judges might form 
about David as a consequence of exposure to 
particular kinds of data. Previous studies 
(2, 3) suggested that individuals relying pri- 
marily or exclusively on projective tests for 
their critical data tended to overestimate a 
subject’s maladjustive propensities. It was 
hypothesized, therefore, that the Ror. judges 
would tend to overestimate the subject’s 
maladjustive tendencies and hence choose 
those response alternatives most characteristic 
of a maladjusted person. 

To test this possibility, the 148 alternatives 
in the criterion test were first rated on a seven- 
point scale of adjustment—maladjustment by 
five judges working independently. Inter- 
judge agreement ranged from a correlation of 
.53 to .71, with a median of .69. The median of
the ratings for the correct alternative in each 
item was used to classify the items according 
to the degree of adjustment or maladjustment 
implied in the correct alternative. Three groups 
of items were delineated, one consisting of the 
12 items whose correct alternative received 
the highest median ratings on adjustment, 
another of the 12 items whose correct alterna- 
tive received the highest median ratings on 
maladjustment, and a third group containing 
the 13 intermediate items. Responses of the
RP, Ror., Nurses, and TB judges were tallied 
to yield subscores for these three classes of 
items. Group means for these accuracy sub- 
scores are reported in Table 3. 
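The grouping of items by the median rating of the correct alternative can be sketched as follows. The ratings are hypothetical, the direction of the scale (higher meaning greater maladjustment) is an assumption, and ties at the group boundaries are broken here by item number, a detail the text does not specify.

```python
from statistics import median


def classify_items(ratings_by_item, n_extreme=12):
    """Split items into adjustive, intermediate, and maladjustive groups.

    ratings_by_item maps an item number to the five judges' ratings of that
    item's correct alternative (assumed: higher = more maladjusted).
    """
    medians = {item: median(r) for item, r in ratings_by_item.items()}
    ordered = sorted(medians, key=lambda item: (medians[item], item))
    groups = {}
    for position, item in enumerate(ordered):
        if position < n_extreme:
            groups[item] = "adjustive"
        elif position >= len(ordered) - n_extreme:
            groups[item] = "maladjustive"
        else:
            groups[item] = "intermediate"
    return groups


# Hypothetical seven-point ratings by five judges for 37 items.
ratings = {i: [1 + (i + j) % 7 for j in range(5)] for i in range(37)}
groups = classify_items(ratings)
counts = {
    label: list(groups.values()).count(label)
    for label in ("adjustive", "intermediate", "maladjustive")
}
print(counts)
```

With 37 items and 12 taken from each extreme, 13 items remain in the intermediate group, matching the division described in the text.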

Differences between scores at stereotype and 
final stages for a given group of judges were 
tested by Wilcoxon’s T method. Differences 
between groups of judges for a given type of 
item at either stereotype or final stage were 
tested by the method of analysis of variance 
of ranked scores. Significance levels of the 
observed differences are reported in Table 4. 
For RP judges, difference in performance on 
adjustive items as compared with maladjustive 
items was significant at both the stereotype 
stage and the final stage; for Ror. judges, such 
a difference was found to be significant only 
at the final stage. The three groups of judges 
participating in the stereotype task differed 
significantly in performance on the malad- 
justive items, but not on adjustive or inter- 
mediate items. At the final stage, however, 
after exposure to the critical data, the three 
groups of judges who completed that task 
showed significant differences in accuracy on 
both adjustive and maladjustive items. 

Finally, RP judges showed a significant 
decrease in accuracy on maladjustive items as 
they advanced from the stereotype to the 
final task, whereas Ror. judges showed a sig- 
nificant decrease in accuracy on adjustive 
items from the former to the latter stage. 

The major features and interrelationships 
of these differences as they pertain to RP and 
Ror. judges are quickly grasped from Fig. 1. 
Since each of the four alternatives to an item 
was assigned a maladjustment index by the
rating procedure described earlier, it was
possible to determine the mean maladjustment 
index of the preferred choices of all the mem- 
bers of a judging group for each item and, 
hence, for each of the three major groups of 
items. On a given item, for example, the mean 
maladjustment index was arrived at by multi- 
plying the frequency of selection of a given 
alternative by the maladjustment weight 
earlier assigned to that alternative and then 
determining the mean of these values for the 
four alternatives in the item. By extension, a 
similar mean maladjustment index could be 
determined for any group of items. Thus, the 
mean maladjustment index for all the choices 
by RP judges at the stereotype stage for the 
12 adjustive items was 3.3, for the interme- 
diate items 3.9, and for the maladjustive items 
3.4. By comparing the profiles for the two 
groups at the stereotype stage with those for 
the final stage, it becomes apparent that the 
net effect of the critical data provided to the 
two groups after the stereotype stage was to 
increase differences between them. RP judges, 
who tended to choose the more favorable 
alternatives at the stereotype stage, simply 
increased that tendency, whereas Ror. judges, 
who tended to choose the somewhat less favor- 
able alternatives at the stereotype stage, in- 
creased that tendency at the final stage. 
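The mean maladjustment index described above can be illustrated with a short computation. The procedure is read here as a frequency-weighted mean of the alternatives' maladjustment weights, an interpretation that keeps the index on the seven-point rating scale, in line with the reported values of 3.3 to 3.9. The frequencies and weights below are hypothetical, as are the function names.

```python
def mean_maladjustment_index(frequencies, weights):
    """Frequency-weighted mean maladjustment weight for one item.

    frequencies: number of judges choosing each of the four alternatives.
    weights: maladjustment weight assigned to each alternative.
    """
    total_choices = sum(frequencies)
    if total_choices == 0:
        raise ValueError("no choices recorded for this item")
    weighted_sum = sum(f * w for f, w in zip(frequencies, weights))
    return weighted_sum / total_choices


def group_maladjustment_index(items):
    """Same index pooled over a list of (frequencies, weights) pairs."""
    total_choices = sum(sum(freqs) for freqs, _ in items)
    weighted_sum = sum(
        f * w for freqs, wts in items for f, w in zip(freqs, wts)
    )
    return weighted_sum / total_choices


# Hypothetical item: nine judges spread over four alternatives.
item_index = mean_maladjustment_index([4, 3, 1, 1], [2.0, 3.0, 5.0, 6.0])
print(round(item_index, 2))
```

A group of judges that concentrates its choices on low-weight alternatives obtains a low index for that item, whatever the accuracy of those choices.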

Figure 2 sheds further light on judges’
preferences. The alternatives for each item
were ranked according to the magnitude of the
mean of the maladjustment weight assigned
to each of them by the five independent judges.
Then choices of Ror., RP, and TB judges at the
final stage were examined to determine frequency
of preference for items of different
ranks. Among RP judges, approximately 46%
of all choices were for alternatives which had a
rank of one (most adjustive), whereas TB
preferences for alternatives with this rank
totaled approximately 27%, and for Ror.
judges 17%. Among alternatives with the lowest
rank (least adjustive), the position of RP
and Ror. judges is reversed. As compared with
the other two groups, TB judges displayed a
fairly equal distribution of preference over the
four ranks. The Nurse group closely approximated
the profile of the RP judges.


TABLE 4

P VALUES OF SIGNIFICANT DIFFERENCES BETWEEN RANKS OF ACCURACY SCORES ON SUBSECTIONS OF THE TEST
AT STEREOTYPE AND FINAL STAGES

[Table body illegible in the source. Surviving labels: Comparison, Judges, Items, Stereotype; first row RP, Adj. > Mal.]

The influence on accuracy of the conceptual- 
izations that led to these different preference 
patterns is apparent in Figure 3, which presents 
a breakdown in percentages of the total num- 
ber of correct choices by each judging group 
at the initial and final stages. As all the preced- 
ing findings would lead one to expect, the 
several judging groups tended to achieve their 
greatest accuracy on different parts of the test. 


Accuracy and Change 


A final point of interest has to do with the
amount of choice change after study of the
critical data in relation to accuracy. Table
5 reports in percentages the incidence and
direction of shift on the adjustive, intermediate,
and maladjustive items for RP and Ror.
judges, together with the final preference of
TB judges. The abbreviations R and W stand
for right and wrong. Thus, R-R signifies
instances in which a correct choice was made at
the stereotype stage and remained unchanged,
whereas W-R refers to those instances in which
choices were incorrect at the stereotype stage
and were changed to correct alternatives at
the final stage. In the W-W category, two
subgroups are distinguished, Wsame (those
instances in which a choice was incorrect at
the stereotype stage, and the same incorrect
choice was repeated at final stage) and Wdiff
(those instances in which the shift involved a
change from one incorrect alternative to
another incorrect alternative).


TABLE 5

DISTRIBUTION OF CORRECT AND INCORRECT CHOICES IN PER CENT FOR FINAL STAGE JUDGES, INCLUDING
INCIDENCE OF CHOICE CHANGES FOR RP AND Ror. JUDGES

[Final Correct columns illegible in the source. Surviving labels: Item Type; W-R; Total; rows for RP, Ror., and TB judges under Adj., Int., Mal., and Total.]
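The right/wrong change categories defined above can be tallied mechanically from each judge's two choices on an item. The sketch below is illustrative: the function names, the category labels for the two W-W subgroups, and the example records are all hypothetical.

```python
from collections import Counter


def classify_change(stereotype_choice, final_choice, correct):
    """Assign one judge-item pair to a right/wrong change category."""
    first_right = stereotype_choice == correct
    final_right = final_choice == correct
    if first_right and final_right:
        return "R-R"
    if final_right:
        return "W-R"
    if first_right:
        return "R-W"
    # Wrong at both stages: same wrong alternative, or a different one.
    return "W-W same" if stereotype_choice == final_choice else "W-W diff"


def change_distribution(records):
    """Proportion of (stereotype, final, correct) records in each category."""
    counts = Counter(classify_change(*record) for record in records)
    total = len(records)
    return {category: n / total for category, n in counts.items()}


# Hypothetical records: (stereotype choice, final choice, correct alternative).
records = [
    ("a", "a", "a"),  # R-R
    ("b", "a", "a"),  # W-R
    ("a", "c", "a"),  # R-W
    ("b", "b", "a"),  # W-W, same incorrect choice repeated
    ("b", "d", "a"),  # W-W, shift to a different incorrect choice
]
distribution = change_distribution(records)
print(distribution)
```

Summing the W-R, R-W, and different-wrong proportions gives the share of initial choices that a group changed, the quantity compared between RP and Ror. judges in the text.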

The first point of interest is the constancy
in percentage of change within RP and Ror.
groups on all three categories of items, on the
one hand, and the difference in amount of
change between the two groups, on the other.
In each of the three groups of items, RP judges
changed approximately 40% of their initial
choices and left approximately 60% unchanged.
Ror. judges exhibited a reverse
tendency in about the same proportion, changing
slightly more than 60% of their initial
choices and retaining slightly less than 40%.
Although not reported in the table here, least
change was found among the four psychiatrists
in the RP group, who changed approximately
34% of their initial choices, as compared
with 47% for the psychologists in that group.
There was virtually no overlap in the distribution
of frequency-of-change scores between
RP and Ror. judges; the difference
between mean ranks of these scores for the
two groups was significant at the .01 level.


[Table 5, Final Incorrect columns (R-W; W-W, subdivided into Wsame and Wdiff; Total): cell values illegible in the source.]


Approximately 55% of all RP errors re- 
sulted from failure to modify initial choices, 
and such changes as were ventured increased 
slightly the accuracy score on adjustive items 
and decreased it slightly on maladjustive items. 
Conversely, only 30% of all Ror. errors re- 
sulted from failure to change, 70% coming 
from changes in initial selections, effecting a 
slight loss in accuracy on adjustive items and 
a slight increase on maladjustive items. 

No significant relationship was found be- 
tween amount of change and final accuracy for 
individual judges. 

DISCUSSION

Undoubtedly the single most fundamental 
problem faced in the study of diagnostic behavior
is the character of the criterion to be
employed. In the present investigation, the 
objective was to avoid such second-order 
criteria as ratings by other judges or patently 
objective but exceedingly complex criteria 
such as a subject’s scores or patterns of re- 
sponse on tests. Instead, an attempt was made 
to deal with characteristics of the subject’s 
spontaneous reactions in a presumably repre- 
sentative sample of his own life experiences. 
Hence, the majority of items in the original 
54-item version of the criterion test were post- 
diction problems. One difficulty with such a 
criterion instrument is that it requires that 
judges be provided with a reasonable amount 
of background situational information for each
item. The attempt to assess the effect of this 
type of information on judges by having them 
make preliminary responses may foster cer- 
tain “sets” which influence their appraisal of 
data at later stages. 

Aside from general familiarity with the 
“halo-effect,” not a great deal is known about
the influence of the initial set or attitude of 
diagnosticians on the interpretation of data. 
In an earlier study (2), two teams of judges, si- 
multaneously observing the same two groups of 
subjects in role-playing situations, exhibited 
systematic differences in their evaluations, 
depending upon their relationship to the sub- 
jects. In the present instance, another factor 
apparently influenced initial set. The signifi- 
cant group differences between RP and Ror. 
judges in evaluation of the biographical infor- 
mation may be accounted for by differences 
in attitude toward the data at the stereotype 
stage. It is possible that mere knowledge that
their next task would demand further blind
appraisal based on the Rorschach alone
heightened the sensitivity or the responsiveness
of Rorschach judges to adverse signs in
the biographical data. The similarity of these
two sets of findings suggests the need for more 
thorough studies of the relationship between 
set and judgment in diagnostic situations. 

A much more basic difficulty with the pres- 
ent type of criterion arises in the selection of 
materials for item construction. The occurrence 
of a balanced collection of adjustive and mal- 
adjustive items in the present criterion test 
was not intentional; indeed, the maladjust- 
ment ratings were gathered after the test had 
been administered to RP and Ror. judges. 
One of the implications of this array of items 
was that David usually behaved in a manner 
characteristic of moderate adjustment, whereas 
fuller acquaintance with and a careful account- 
ing of all of David’s behavior over a long period 
of time might indicate that the true distribu- 
tion is more skewed. Hence, interpretation of 
the scores earned by the several groups must 
be made with caution, for if the true distribu- 
tion for David is in fact skewed in one direction 
or another, then the present results on differ- 
ential accuracy between groups are misleading. 
It should be pointed out, however, that the 
very considerable amount of data collecting
that lay behind the construction of the criterion
test was undertaken precisely in order
to sample a very large collection of life situa- 
tions, and that viewed against this back- 
ground, David appeared to have neither more 
nor less than his share of personality assets 
and liabilities. 

Of far greater interest than the scores them- 
selves are the circumstances that produced 
essentially similar scores in groups of judges 
who appear to have held quite different con- 
ceptualizations of the subject. 

Kelly and Fiske (4), Soskin (3), and others 
have reported results indicating that access to 
additional data beyond some optimal amount 
does not increase the accuracy of appraisal 
or the prediction of human behavior. The 
present study clearly indicates that such a 
conclusion reflects only part of the story. 
Here, two groups of judges presented with 
identical data—biographical facts—formed 
slightly different conceptualizations of the 
subject. When each of the groups was then pro- 
vided with a different type of additional data, 
one group exhibited only a slight change in its
initial conceptualization, whereas the other 
exhibited a marked change. 

The divergence occurring in this second 
stage must be presumed to have occurred be- 
cause the low-change (Role-play judges) group 
found the additional data to be essentially 
consistent with its initial hypothesis, i.e., the 
additional data revealed little which was not 
already known or inferable from the biograph- 
ical data; whereas the high-change group 
(Rorschach judges) found its additional data 
to be in large measure inconsistent with the 
initial hypothesis, i.e., the judges had obtained 
a considerable amount of new information to 
which they attached a high degree of credi- 
bility. 

Keeping in mind that the two groups dif-
fered significantly on maladjustive items at the
initial stage, it appears that for each group 
initial conceptualizations were effected through 
an integration of certain selected salient con- 
sistencies in the data, and that contradictory 
data were regarded by the judges as instances 
of tolerable inconsistency and, hence, were 
ignored. 

Apparently, once formed, such a relatively 
undifferentiated conceptualization is main-
tained regardless of the amount of additional
data subsequently provided, so long as the ad- 
ditional data yield the same relative propor- 
tions of confirmatory and contraindicative in- 
formation. However, when a sufficient mass of 
new information which the judge regards as 
highly credible, relevant, and contraindicative 
becomes available, the initial formulation is 
likely to change. Much of the material pre- 
viously ignored or discounted may then be 
integrated into the new conceptualization, 
while evidence previously regarded as support- 
ing a different conceptualization may be dis- 
counted. 

This appears to have been the experience of 
Rorschach judges in the second stage. A 
sufficient amount of relevant, credible, con- 
traindicative information appeared in the 
Rorschach data to warrant abandoning the 
initial conceptualization in favor of an essen- 
tially new one. It need scarcely be pointed out 
that both the relevance and credibility of 
certain of the new information, at least with 
respect to the present criterion, were somewhat
overestimated by the majority of the 
Rorschach judges, inasmuch as they aban- 
doned features of the initial conceptualization 
which were essentially sound even though 
based on biographical facts alone. 

In view of the consistency of preference 
trends among members of a given group of 
judges, it appears that, wittingly or unwit- 
tingly, the fundamental ingredient in the con- 
ceptualizations of the majority of judges was 
a gross judgment as to David’s adjustive vs. 
maladjustive propensities, and this judgment 
was brought to bear in the majority of decision 
situations in a fairly consistent manner. 

It seems reasonable to assume that when- 
ever a very large number of decisions is called 
for, each of which requires the evaluation of a 
substantial number of subtly related facts, 
the most parsimonious solution is to predict 
from a general characteristic rather than to 
attempt to integrate a large number of con- 
tingent probabilities for each separate decision. 

In general, the Rorschach findings are con- 
sistent with results obtained in two previous 
studies (2, 3) in which personality appraisals 
based solely or primarily on Rorschach infor- 
mation were found to err quite consistently 
in the direction of overestimating the degree of 
maladjustment. 


The performance of Role-play judges is 
more perplexing. An implicit basic hypothesis 
in this study was that situational behavior 
is largely a function of ego processes whose 
characteristics are only poorly represented in 
the perceptual response to the Rorschach 
cards. To assess these processes it is necessary 
to create test conditions which elicit them, 
and it was felt that role-playing situations 
would accomplish precisely this end. The 
present results, however, suggest that ex- 
perienced clinical observers, given a rela- 
tively long period of observation of a variety 
of interpersonal situations, were unable to 
make the discriminations that had been 
expected of them. The striking similarity in 
pattern of preferences exhibited by the nurses 
(who had only the grossest general informa- 
tion) and the Role-play judges commands 
attention. It suggests that for all the specialized 
information available to them, the Role-play 
judges as a group, both before and after ob- 
taining the specialized information, enter- 
tained essentially the same expectations of 
the subject’s behavior as would a group of 
unsophisticated young women operating solely 
on the basis of their general social experience. 
The results suggest the following possibilities: 


1. Role-playing behavior leaves the defenses suffi- 
ciently intact to preclude the possibility that most 
judges could detect maladjustive tendencies; or, 

2. The task assigned to judges in this study was 
so complex as to oblige them to make most decisions 
on the basis of a single over-all impression; or 

3. The majority of judges developed a positive 
affective response toward the subject which caused 
them to evaluate his behavior with a selective bias. 


SUMMARY 

In a diagnostic task where groups of judges 
were called upon to specify certain behaviors, 
characteristics, and interests and attitudes of 
a subject (a) on the basis of biographical facts 
alone and then (b) on the basis of either obser-
vation of role-playing situations, interpreta-
tion of a Rorschach protocol, or study of a
battery of objective and projective tests:

1. The five groups of judges involved did 
not differ significantly in over-all accuracy 
regardless of the type of information on which 
their judgments were based. 

2. Judges who observed the subject in role- 
playing situations and judges who studied the 
subject’s Rorschach protocol showed no greater
accuracy after exposure to these types of data
than had been achieved by study of biograph- 
ical facts alone. 

3. Information obtained from observing 
the subject in nine role-playing situations 
appeared merely to confirm the conceptualiza- 
tion which a group of judges had formed from 
study of biographical data. 

4. Information obtained from the Rorschach 
protocol increased the tendency of a group of 
judges to choose items characteristic of mal- 
adjustive behavior beyond that which was 
evident when the group studied only bio- 
graphical data. 

5. As compared with observers of role- 
playing, Rorschach interpreters showed a 
significantly greater tendency to change their 
selections after exposure to the critical data;
in neither group, however, was amount of
change significantly related to increase or 
decrease in accuracy. 

6. Greatest flexibility of characterization of 
the subject was exhibited by a group of judges 
who based their selections on a diversified 
battery of objective and projective tests. 


REFERENCES 
1. Walker, Helen M., & Lev, J. Statistical inference.
New York: Henry Holt, 1953.
2. Soskin, W. F. Frames of reference in personality
assessment. J. clin. Psychol., 1954, 10, 107-114.
3. Soskin, W. F. Bias in postdiction from projective
tests. J. abnorm. soc. Psychol., 1954, 49, 69-74.
4. Kelly, E. L., & Fiske, D. W. The prediction of
performance in clinical psychology. Ann Arbor:
Univer. Michigan Press, 1951.


Received September 19, 1957. 





THE SOCIAL PSYCHOLOGY OF RORSCHACH VALIDITY RESEARCH 
LEON H. LEVY AND THOMAS B. ORR


Indiana University 


Recent years have seen an increasing
appreciation of the possible contri-
bution of the personality of the ex-
aminer to the variance of projective test
protocols (1, 4, 5, 6, 7, 10, 12). While many of 
the studies concerned with this problem may 
be subject to methodological criticism (8), 
they are all fairly consistent in their finding of 
a significant relationship between various 
personality traits of the examiner and the 
scores and interpretations of tests adminis- 
tered by him. Until the nature of the mecha- 
nisms mediating these relationships is fully 
understood, generalizations about the utility 
of projective tests per se seem extremely haz- 
ardous. For example, when Investigator A 
fails to replicate the findings of Investigator 
B, are his results a function of his personality 
or of the instrument or theory in question? We 
have no way of apportioning the variance. 
Such problems may be construed as falling 
within the broader purview of the sociology 
of knowledge (9), where the ideas and infor- 
mation of individuals or groups are related to 
their particular social status and values. 
Within such a context, one may ask to what 
extent the outcome and interpretation of re- 
search is conditioned by the circumstances 
under which it is instigated and conducted. 
This question does not refer to methodological 
issues but, rather, to the needs, interests, and 
commitments of the individual researcher and 
the institutions with which he is affiliated. 
Raising this query implies no derogation of 
either the scientific integrity or rigor of the 
researcher; it merely asks whether what may 
be called the social psychology of research is an 
important consideration in evaluating any 
substantial body of investigation or a poten- 
tially fruitful area for further investigation. 
Psychologists, like other scientists, rely 
upon the logic of statistical inference and ex- 
perimental control in attempting to divorce 
their findings from their own personal beliefs 
and prejudices. If they are justified in this 
reliance, then the outcome and interpretation 
of research concerning any controversial issue
should not be related to the theoretical or
institutional commitments of the researcher. 
Should this not be the case, important meth- 
odological and substantive considerations 
would necessarily follow. As a preliminary 
foray into this area, we have chosen to analyze 
in a fairly gross way the social psychology of 
Rorschach validity research. 

The Rorschach validity literature is par- 
ticularly well suited to these purposes. First, 
it is extensive. Second, it is far from univocal 
in its findings. Third, it is performed by re- 
searchers with varying institutional affiliations. 
And last, it seems possible to make certain 
plausible inferences regarding the dominant 
needs and interests associated with these 
institutional settings and the kinds of research 
findings that might be most consistent with 
them. The problem then is to determine 
whether such a relationship does in fact exist. 

We shall be concerned with the relation- 
ships between three dichotomized variables 
in this study: (a) the type of institution in 
which the research was conducted, whether 
academic or nonacademic; (b) the type of
validity study, whether construct or criterion- 
oriented (3); and (c) the outcome or interpre- 
tation of the research, whether it was favorable 
or unfavorable with regard to the validity of 
the Rorschach. 

From a consideration of the presumably 
dominant interests and concerns of individuals 
affiliated with academic and nonacademic 
settings several hypotheses were generated. 
To begin with, it seems reasonable to assume 
that the investigator who owes his primary 
allegiance to an academic institution is more
likely to be interested in the theoretical aspects 
of a diagnostic instrument, i.e., its construct 
validity, than with its practical aspects or its 
criterion validity. This is not to say of course 
that the academician is unconcerned with the 
practical utility of tests—only that he is more 
likely to consider validity as contingent upon 
a thorough theoretical understanding of the 
instrument. The researcher working under the 
auspices of a nonacademic institution, on the
other hand, is in a setting where a diagnostic
device must pay its way in the making of day- 
to-day clinical decisions. Hence, he is more 
likely to be concerned with its criterion validity 
than with its construct validity. Again, it is 
incorrect to suggest that such individuals have 
no theoretical interests or perform no theoreti- 
cal research. But research in a service setting 
is more frequently considered a luxury item, 
and, if conducted, it is more likely to be of a 
practical than theoretical nature. The first 
hypothesis, then, may be formulated as 
follows: Rorschach validity research performed 
in an academic setting is more frequently con- 
struct-oriented than criterion-oriented, whereas 
the reverse holds true for research conducted 
in a nonacademic setting. 

In line with the dominant kinds of enthusi- 
asms postulated for each type of setting, it 
seems reasonable to assume that research 
results consistent with these enthusiasms 
would be more highly valued than those not 
consistent with these interests. Thus, the
theoretically oriented psychologist may be 
more concerned with obtaining positive results 
in a construct validity study than in a criterion 
validity study, whereas the reverse holds true 
for the practically oriented psychologist. 
Accordingly, the second hypothesis asserts 
that in studies emanating from academic 
institutions there is a higher proportion of 
favorable results obtained in construct validity 
studies than in criterion validity studies, 
whereas the reverse holds true for studies per- 
formed in nonacademic settings. 

As a corollary of the second hypothesis, a 
third may be formulated: A larger proportion 
of construct validity studies performed in 
academic settings than in nonacademic settings 
is favorable with respect to Rorschach test 
validity, whereas the reverse relationship 
holds for criterion validity studies. 

Finally, implied in these three hypotheses
is a fourth hypothesis that the three vari-
ables of institutional setting, type of validity
study, and interpretation are interrelated.

These four hypotheses are all predicated
upon the assumption that a relationship exists
between the institutional affiliation of a re-
searcher or the institutional aegis under which
research is conducted, the kind of research per-
formed, and the probability of obtaining
certain types of outcomes or interpretations
of data. To the extent that the null statements 
of these hypotheses may be rejected at sta- 
tistically reliable levels, this assumption gains 
in credibility.

METHOD

On the assumption that the bulk of 
Rorschach validity studies is published in the 
Journal of Abnormal and Social Psychology, 
Journal of Clinical Psychology, Journal of
Consulting Psychology, and the Journal of 
Projective Techniques, each article which ap- 
peared in these journals during the years 1951 
through 1955, inclusive, was inspected to see 
if it met the criteria for inclusion in this study. 
The year 1951 was chosen as a starting point 
because Cronbach’s (2) statistical critique 
of Rorschach research methodology appeared 
two years earlier. Given a maximum publica- 
tion lag of two years, research from 1951 on 
might be expected to show greater meth- 
odological sophistication than that done pre- 
viously. No attempt was made, however, to 
check the validity of this expectation. 


Criteria for Selection 

To be selected, a study had to test at least one hy- 
pothesis, explicit or implicit, relevant to the interpre- 
tation or to some application of the Rorschach test. 
While studies of reliability, internal consistency, ad- 
ministration and scoring techniques, examiner influ- 
ences, certain normative studies, etc., may have some 
implications for the validity of the test, they were not 
considered here as evidence on validity. Including 
some of these studies would have made the categories 
of validity closer to Cronbach and Meehl’s (3) defini- 
tions but would have introduced additional subjec- 
tivity in selection and classification. Studies which 
investigated widely used principles of Rorschach inter- 
pretation, such as “color shock,” were included even 
if the Rorschach plates had not been used, so long as 
the purpose had been to provide information specifi- 
cally relevant to Rorschach usage. It was not required 
that an article consider the Rorschach only; some am- 
bitious studies could conceivably have been divided 
into a number of separate articles, each considering 
hypotheses relevant to a single instrument. Only the 
Rorschach hypotheses were considered, however, in 
classifying such a study.

At least one statistical test that would allow a state-
ment as to the probability of the obtained results was
also required. This criterion excluded purely expository 
articles, case studies, and a few other articles that 
offered scores or frequencies without any indication as 
to their significance. A further requirement, naturally, 
was that the studies be reported in sufficient detail to 
be subsequently categorized on the variables considered 
here.

One hundred sixty-eight studies were found that
met the indicated criteria. Each was coded for each of 
the three variables studied, together with additional 
identifying information on individual cards. All cards 
were placed in a common pile as they were filled out 
and were not referred to until all articles had been 
classified. 


Definition of Categories 


The institutional variable. Institutional affiliation was 
dichotomized along the academic-nonacademic axis. 
Studies were classified on the basis of the affiliation of 
their senior author. If the location listed under his 
name was a college or university, including schools of 
medicine, the setting was considered academic. All 
other affiliations were considered nonacademic. When- 
ever a footnote indicated that the study had served as 
the basis for an academic degree, the setting was 
classified as academic even though the author was 
associated with a nonacademic institution at the time 
of publication. Similarly, if a footnote indicated that 
the entire study had been conducted under a previous 
affiliation, the classification was made on the basis of 
the earlier setting. When the senior author listed more 
than one affiliation, the one which agreed with the 
affiliations of the majority of his co-authors was used. 
In the case of a single author with more than one affilia- 
tion, the first listed was used as the basis for classifi- 
cation. 

Type of validity study. Studies were classified as 
bearing upon either the construct validity or the cri- 
terion validity of the test. Cronbach and Meehl’s (3) 
definitions were used, with both predictive and con-
current validity studies subsumed under the heading
of criterion validity. Any study in which the variable
related to the Rorschach was considered to be of im-
mediate value in practical decision-making was placed
under the criterion validity classification. A construct 
validity study was considered one in which the non- 
Rorschach variable was of primarily theoretical 
interest and could be used only indirectly, if at all, in 
the clinical decision-making process. This distinction 
is essentially the same as that Cronbach and Meehl 
make between studies where the primary interest is in 
predicting the criterion and the test behavior is only 
of minor or secondary interest and those where the 
major concern is with the test itself and the criterion 
is chosen on the basis of theoretical considerations. In 
those studies which seemed to have both criterion and 
construct validity implications, classification was based 
on what appeared to be the major concern of the author. 

The outcome or interpretation. On research outcome, 
studies were classified as favorable or unfavorable with 
regard to Rorschach validity on the basis of the inves- 
tigator’s conclusions from his own data. In those 
instances where several hypotheses or several tests of 
the same hypothesis were considered and where not all 
of them reached levels of statistical significance, the 
investigator’s evaluation of his findings still served as 
the basis for classification. Where the author made no 
clear statement regarding the interpretation to be 
placed on his data, a judgment was made by the 
present authors.

Clearly, investigators differ in the stringency of the
criteria which they apply in the evaluation of their 
findings. However, for the purposes of the present 
study it seemed that the appropriate datum was the 
author’s own evaluation. Thus, the question remains 
of whether, granting such individual differences, they 
are less than or greater than any differences among
types of institutional affiliation. 

RESULTS 

Interjudge agreement on the classification 
of 40 randomly selected studies judged inde- 
pendently by a second person on each of the 
three variables was 98% on the institutional 
variable, 93% on the type of study variable, 
and 90% on the outcome variable. While these 
are all sufficiently high to warrant confidence 
in the objectivity of the classifications which 
form the basis of this study, it is interesting 
to note that none was completely unambig- 
uous. 

All hypotheses were tested by means of chi 
square analyses applied to various combina- 
tions and rearrangements of the data pre- 
sented in Table 1. Table 2 presents a summary 
of the results of these analyses. In all instances, 
the tests were based on one degree of freedom. 
The over-all chi square testing the fourth 
hypothesis and applied to the entire table was 
performed by Norton’s (11) method for calcu- 
lating chi square for complex contingency 
tables. 
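
The two-by-two comparisons summarized in Table 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the ordinary Pearson chi square without a continuity correction, the function names are hypothetical, and the academic/construct favorable count of 51 is inferred as the remainder of the 168 studies after the other seven cells of Table 1 are subtracted. The three-variable test by Norton's method is not attempted here.

```python
# Cell counts as (favorable, unfavorable), keyed by (setting, type of study).
# ASSUMPTION: the academic/construct favorable count of 51 is inferred as
# 168 minus the seven other cells; it is not read directly from the table.
TABLE_1 = {
    ("academic", "construct"): (51, 22),
    ("academic", "criterion"): (12, 23),
    ("nonacademic", "construct"): (14, 14),
    ("nonacademic", "criterion"): (19, 13),
}


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi square (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def type_by_setting():
    """Hypothesis 1: type of study against setting, outcomes pooled."""
    a = sum(TABLE_1[("academic", "construct")])
    b = sum(TABLE_1[("academic", "criterion")])
    c = sum(TABLE_1[("nonacademic", "construct")])
    d = sum(TABLE_1[("nonacademic", "criterion")])
    return chi_square_2x2(a, b, c, d)


def results_by_type(setting):
    """Hypothesis 2: outcome against type of study within one setting."""
    fav_con, unf_con = TABLE_1[(setting, "construct")]
    fav_cri, unf_cri = TABLE_1[(setting, "criterion")]
    return chi_square_2x2(fav_con, fav_cri, unf_con, unf_cri)


def results_by_setting(study_type):
    """Hypothesis 3: outcome against setting within one type of study."""
    fav_ac, unf_ac = TABLE_1[("academic", study_type)]
    fav_non, unf_non = TABLE_1[("nonacademic", study_type)]
    return chi_square_2x2(fav_ac, fav_non, unf_ac, unf_non)


if __name__ == "__main__":
    print(f"H1 type x setting:          {type_by_setting():.2f}")
    print(f"H2 academic settings:       {results_by_type('academic'):.2f}")
    print(f"H2 nonacademic settings:    {results_by_type('nonacademic'):.2f}")
    print(f"H3 construct studies:       {results_by_setting('construct'):.2f}")
    print(f"H3 criterion studies:       {results_by_setting('criterion'):.2f}")
```

Under these assumptions the five statistics round to 7.04, 12.32, .53, 3.48, and 4.23, in agreement with the values reported for the first three hypotheses, which lends some support to the inferred cell count.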

From Table 2 it will be seen that the first 
hypothesis received support at a high level of 
confidence. Apparently, the kind of validity 
study done is related to the setting in which 
it is performed. The second hypothesis is 
supported only in the case of studies performed 
in an academic setting; hence it must be con- 
sidered only partially confirmed. In the case 
of research performed in academic institu- 
tions, there is a significant relationship be- 
tween the kind of validity study undertaken 
and the probability of favorable or unfavorable 
results, but there is no evidence for such a re- 
lationship in studies issuing from nonacademic 
institutions. In an academic setting, the 
chances are better than two to one that a 
construct validity study will yield favorable 
results vis-a-vis the Rorschach, while the odds 
are almost two to one that a criterion validity 
study will yield negative results. 

The test of the third hypothesis reached
significance at the .05 level for both the con-
struct and criterion validity studies, thus
supporting the contention that for a given
type of validity study, the probability of a
positive or negative outcome varies with the
kind of setting in which it is performed. Simi-
larly, the fourth hypothesis, asserting the
interrelationship of the three variables, re-
ceived support at the .01 level of significance.


TABLE 1
DISTRIBUTION OF STUDIES AMONG VARIABLES OF
SETTING, TYPE OF STUDY, AND OUTCOME

                     Academic               Nonacademic
Outcome        Construct  Criterion    Construct  Criterion

Favorable          51         12           14         19
Unfavorable        22         23           14         13


TABLE 2
SUMMARY OF RESULTS OF CHI SQUARE TESTS FOR
FOUR HYPOTHESES

Hypothesis  Comparison                            χ²      p*

1           Type Study × Setting                  7.04    .01
2           Results × Type Study (Academic)      12.32    .001
            Results × Type Study (Nonacad.)        .53    N.S.
3           Results × Setting (Construct)         3.48    .05
            Results × Setting (Criterion)         4.23    .05
4           Results × Type Study × Setting        7.71    .01

* All probabilities except that for Hypothesis 4 are for a one-
tailed test because the direction of differences was predicted in
each case.


DISCUSSION 


The results clearly support the hypotheses 
concerning the interrelationship among the 
institutional setting of research, the kind of 
research done, and the type of outcome. In 
turn, they strongly suggest that in approach- 
ing any body of research dealing with a partic- 
ular issue, one must inquire beyond the “box 
score” in order to evaluate the findings. There 
does appear to exist a social psychology of 
research. 

The relationship between institutional set- 
ting and type of study has serious implications. 
If it were found that construct validity studies 
generally had a better chance than criterion 
validity studies of producing positive results,
there would be no cause for alarm except
among those using the Rorschach to decide 
between such alternatives as brain damage or 
no brain damage, psychotic or neurotic, to 
shock or not to shock, to treat or not to treat, 
etc. Such a finding would imply that the test 
is not ready for such tasks or that we have 
not yet learned how to construe these alterna- 
tives in terms of the constructs measured by 
the test, assuming their relevance. But when 
the probability of a favorable outcome in a 
construct validity study drops from 70% in 
academic to 50% in nonacademic research, 
and the probability of a favorable outcome in 
a criterion-oriented study rises from 34% to 
59% from one setting to the other, it gives one 
pause in the marshalling of either pro- or 
anti-Rorschach evidence! 
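
The percentages quoted above follow directly from the cell frequencies of Table 1, as the short sketch below illustrates. It is a hedged reconstruction rather than the authors' own computation: the helper names are hypothetical, and the academic/construct favorable count of 51 is an inferred value (the 168 studies less the seven other cells).

```python
# (favorable, unfavorable) counts keyed by (setting, type of study).
# ASSUMPTION: the value 51 is inferred from the total of 168 studies.
CELLS = {
    ("academic", "construct"): (51, 22),
    ("nonacademic", "construct"): (14, 14),
    ("academic", "criterion"): (12, 23),
    ("nonacademic", "criterion"): (19, 13),
}


def percent_favorable(setting, study_type):
    """Percentage of studies in one cell judged favorable to the Rorschach."""
    favorable, unfavorable = CELLS[(setting, study_type)]
    return 100 * favorable / (favorable + unfavorable)


def odds(setting, study_type, outcome="favorable"):
    """Odds of the named outcome against the other outcome in one cell."""
    favorable, unfavorable = CELLS[(setting, study_type)]
    if outcome == "favorable":
        return favorable / unfavorable
    return unfavorable / favorable


if __name__ == "__main__":
    for setting, study_type in CELLS:
        rate = percent_favorable(setting, study_type)
        print(f"{setting:12s} {study_type:10s} {rate:.0f}% favorable")
```

On these counts the construct rate falls from about 70% to 50% and the criterion rate rises from about 34% to 59%. The same counts give odds of roughly 2.3 to 1 for a favorable construct study and 1.9 to 1 for an unfavorable criterion study in academic settings.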

The categories used in the present study 
were admittedly gross, and one may ask what 
one would find if the research were more finely 
classified in terms of problems investigated, 
techniques and procedures used, etc. Of no 
small interest and importance is the study of 
the characteristic research output of particu- 
lar individuals. Unsystematic observation 
certainly suggests that there are persons who 
consistently report either positive or negative 
findings with regard to the Rorschach. 

One may also be concerned with variations 
in quality of research, a problem that certainly 
must be attacked eventually. The articles 
reviewed all purported to follow accepted ex- 
perimental and statistical procedures, but the 
findings clearly indicate the presence of some 
sort of bias. Either some of the studies were 
methodologically weak, or present statistical 
concepts need further scrutiny. For example, 
does the .05 level of significance mean the same 
thing when used as the criterion for rejection 
of the null hypothesis in a construct validity 
as opposed to criterion validity study executed 
in an academic setting? On the basis of the 
present data, it seems clear that this is not the 
case. Other things being equal, a construct 
validity study has a 70% chance of producing 
significant results at this level or better, 
whereas a criterion validity study has only a 
34% chance of doing so. Surely some allowance 
should be made for such discrepancies in decid- 
ing whether or not a null hypothesis may be 
rejected and in stating the confidence with 
which this decision is made.

But, as Mannheim (9) points out, to accept
the present findings as a basis for discounting 
certain studies because of their institutional
origin would be in serious error. The reported 
results not only pose a barrier in the way of 
those who would uncritically gather their 
support or intellectual ammunition; they may 
also hold the germ of greater insight into 
human behavior. Attempting to manipulate 
probabilities so as to compensate for base 
rates is only one solution to the problem 
raised here. A more challenging and potentially 
more rewarding approach lies in the intensive 
study of the behavior of the researcher himself. 
For, intentionally or not, he seems to exercise 
greater control over human behavior than is 
generally thought. By the investigation of the 
process of research as conducted in various 
settings, the biases and personalities of investi- 
gators, and the pressures and ideologies to 
which they may be subject, we may learn not 
only how to account for and reduce the kinds of 
discrepancies found in the present study, but 
also how to develop more powerful means for 
the prediction and control of human behavior. 

While the findings of this study have impli- 
cations beyond the realm of Rorschach validity 
research, their importance for this particular 
area is considerable. Together with demonstra- 
tions of examiner influences on test protocols, 
they make a complete examination of the re- 
search procedures and behaviors of Rorschach 
investigators imperative if any sense is to be 
made out of the swelling literature dealing with 
the test. 


SUMMARY 


In an attempt to determine the extent to 
which sociological variables may be important 
in contributing to the kind of research done 
and probabilities of certain outcomes, 168 
Rorschach test validity studies appearing in 
four journals during a five-year period were 
culled according to certain criteria. These 
studies were classified on three dichotomized 
variables: each study came from an academic 
or nonacademic setting, was primarily con-
cerned with construct or criterion validity, and
was interpreted by its author as yielding re- 
sults which were either favorable or unfavor- 
able to the Rorschach. Four hypotheses were 
formulated and tested regarding the relation- 
ships between these variables; each was sub- 
stantially supported. The implications of these 
findings for the importance of the social 
psychology of research as an area of investiga- 
tion were discussed with particular emphasis 
on the problems involved in the evaluation 
of current Rorschach test validity research.


MULTIVARIABLE ANALYSIS OF THE CONCEPTUAL BEHAVIOR OF SCHIZOPHRENIC AND BRAIN-DAMAGED PATIENTS


Most studies of differences between
schizophrenic and organically based 
disorders in thinking have employed 
either a form of analysis derived from the 
Goldstein-Scheerer (1941) dichotomy of ab- 
stract-concrete “attitudes” or one which 
assesses group differences in terms of “in- 
tellectual deterioration.” Since both of these 
conceptions imply that thinking occurs as a 
function of a single “ability” or faculty—
“abstract attitude” or “intellect”—they can
provide reliable distinctions between schizo- 
phrenics and the brain-injured only in terms 
of differences in degree of loss of this ability. 
Thus, when differences are found with these 
forms of analysis, the organic group usually 
(but not always) shows greater concreteness 
(nonabstractness) or greater ineptness of per- 
formance in “intellectual” tasks. In addition 
to objections concerning limitations upon the 
interpretation of such findings, there is the 
added difficulty cited by Yates (1954) that the 
effects of possibly important control factors 
(e.g., age, intelligence, education, sex) usually 
have been ignored. Consequently, one cannot 
be sure of the sources to which such group 
deficits in a unitary ability can be attributed. 
There have been a few investigators 
(Cameron, 1938; Hunt & Cofer, 1944; Hutson 
& Shakow, 1949; Lidz, Gay, & Tietze, 1942; 
Rashkis, 1947) who have attempted distinc- 
tions between schizophrenic and brain-dam- 
aged conceptualization on bases other than 
the existence of greater or lesser impairment 
of abstract or of intellectual ability. More 
recently, studies by McGaughran (1954), 
McGaughran and Moran (1956; 1957), Moran,
McGaughran, and Leventhal (1957), and by
Grassi (1947; 1953) have introduced two
schemas permitting a multivariable analysis
of conceptual behavior. Each of these schemas
has been employed to assess group differences
in conceptualization between schizophrenic
and brain-damaged patients.

1 Formerly at the University of Houston.
2 Employed part time at the Houston Veterans Administration Hospital, where part of the data analysis of this study was conducted.

One schema employs, at the present, two 
conceptual variables, ‘‘amount of social agree- 
ment” and “order of conceptual classification.” 
In one study (McGaughran & Moran, 1956), 
it was successfully predicted that the major 
difference between schizophrenic and non- 
psychiatric groups is the amount of social 
agreement or communality characterizing the 
concepts they employ. In another study 
(McGaughran & Moran, 1957), order of con- 
ceptual classification showed the greater differ- 
ence between schizophrenic and brain-dam- 
aged Ss, while the scores on this measure for 
the nonpsychiatric group fell between those 
of the two deviant groups. It was concluded 
that: “The differences in conceptual behavior 
between schizophrenic and brain-damaged
patients suggest that the two groups should 
not be represented as evidencing the same type 
of conceptual disorder” (McGaughran & 
Moran, 1957, p. 48). 

Grassi (1953) reported a high degree of 
success in differentiating conceptual perform- 
ance among normal, schizophrenic, and brain- 
damaged groups. His schema may also be 
interpreted as consisting of two conceptual 
variables: “abstract-concrete” and “complex-simple.”
It is only fair to state that
Grassi apparently has not considered his 
method in this way. For him (1953), the terms 
“simple” and “complex” only convey two 
levels of difficulty in tasks involving either 
abstract or concrete performance. Further, 
the single score used in the Grassi test is obtained
by summing the subscores for performance at both
levels of difficulty for both concrete and abstract
tasks. This usage of a single
score is clearly consistent with the Goldstein- 
Scheerer (1941) formulation of the presence- 
absence (or loss) of abstract ability. Indeed, 
Grassi (1953) states that “the theory and 
rationale set forth by Goldstein and Gelb is 
that upon which the Grassi test is predicated” 
(p. 8). He considers his test simply as a more 
sensitive index of “impairment in both the 
concrete and abstract spheres” (p. 13). Con- 
sidering some of Grassi’s quantitative results 
and some of his interpretations of his data, 
however, it may be rewarding to examine his 
instrument from the viewpoint of an explicit 
multivariable form of analysis. 

The primary purpose of this study was to 
cross-validate the research findings of 
McGaughran and Moran (1957) and Grassi 
(1953) of differences in conceptual behavior 
between schizophrenic and brain-injured
groups. Three other related purposes were (a)
to extend the “range of convenience” of the
McGaughran and Moran schema by applying
it under different operational conditions, (b)
to assess the predictive efficiency of both sys- 
tems of conceptual analysis, and (c) to in- 
vestigate the feasibility of considering the 
Grassi schema as one providing the opportu- 
nity for a multivariable analysis of conceptual 
behavior. 


METHOD 
Subjects 


Ss were 60 male patients at Topeka State Hospital, 
Topeka, Kansas. Psychiatric diagnoses served as the 
criteria for selection. Thirty patients diagnosed as 
schizophrenic reaction, paranoid type, who had been 
in the hospital for at least two years made up the 
schizophrenic group. Thirty patients diagnosed as 
showing a chronic brain syndrome (without psychosis) 
associated with cerebral arteriosclerosis, senile brain 
disease, alcoholism, brain trauma, or central nervous 
system syphilis composed the brain-damaged group. 
There were no significant differences between the 
schizophrenic and brain-syndrome groups in age, 
education, or estimated intelligence level.³

³ Upon request, mimeographed copies of a table of
means and standard deviations of the distributions for
these variables will be provided by the writers.


Procedure 


One examiner administered individually to all Ss 
the following tests of concept formation: (a) an object-
sorting task, (b) a similarities task, and (c) the Grassi
Block-Substitution Test. Before scoring was undertaken,
the identity of the individual test protocols was
concealed by means of coding. Each separate item in 
the object-sorting and similarities tasks was scored for 
all Ss; then the next item was scored in a similar way, 
and so on. Interscorer agreement on the object-sort- 
ing and similarities tasks was 90% and 86%, respec- 
tively; because scores on the Grassi test are based sim- 
ply upon performance accuracy and time, no assessment 
of scoring reliability seemed necessary. 

Object-sorting task. Introduced by Goldstein and 
Gelb and modified by a number of others, the object- 
sorting task requires 19 different sortings of a group of 
32 objects familiar in everyday experience. The task 
materials and administration corresponded to those 
described by Rapaport (1946) and previously employed 
by McGaughran and Moran. The 19 responses by each 
S were scored according to the schema described pre- 
viously (McGaughran & Moran, 1956; 1957). In this 
schema,⁴ extent of social agreement (or “publicness-
privateness”) and order of conceptual classification
(or “openness-closedness”) are intersected to form
four scoring quadrants termed closed-public, open- 
public, open-private, and closed-private. These quad- 
rants or “conceptual areas” are the basic scoring cat- 
egories; each response is classified as occurring within 
one of the four. 

The determination of the appropriate quadrant or 
area for a concept requires a cross-classification in which 
the effects of both variables can be taken into account 
with one notation. By summing the individual S’s
scores in adjacent quadrants, two additional measures, 
“total public vs. private” and “total open vs. closed,” 
can be derived. These measures provide a notation of 
the effect of either variable, independent of the other. 
One additional measure, an “autistic index,” is de- 
rived by subtracting the sum of closed-public concepts 
from the sum of concepts scored as open-private. 
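
As an illustration only, the derivation of the summary measures from the four quadrant tallies can be sketched in Python. The function name, the label strings, and the example protocol are assumptions introduced here and are not part of the published scoring instructions; the constant of 10 added to the autistic index follows the notes to Tables 1 and 2 ("open-private minus closed-public, plus 10").

```python
from collections import Counter

# Quadrant labels as named in the text; the string spellings are an assumption.
QUADRANTS = ("closed-public", "open-public", "open-private", "closed-private")


def conceptual_area_scores(responses):
    """Tally quadrant-coded responses and derive the summary measures."""
    unknown = set(responses) - set(QUADRANTS)
    if unknown:
        raise ValueError(f"unrecognized quadrant labels: {sorted(unknown)}")
    counts = Counter(responses)
    closed_public = counts["closed-public"]
    open_public = counts["open-public"]
    open_private = counts["open-private"]
    closed_private = counts["closed-private"]
    return {
        "closed-public": closed_public,
        "open-public": open_public,
        "open-private": open_private,
        "closed-private": closed_private,
        # Sums over adjacent quadrants isolate one variable at a time.
        "public": closed_public + open_public,
        "closed": closed_public + closed_private,
        # Open-private minus closed-public, plus 10 (per the table notes).
        "autistic_index": open_private - closed_public + 10,
    }


# Hypothetical protocol of 19 object-sorting responses (not real data).
example = (["closed-public"] * 4 + ["open-public"] * 5
           + ["open-private"] * 9 + ["closed-private"] * 1)
scores = conceptual_area_scores(example)
print(scores)
```

Because every response falls in exactly one quadrant, the "private" and "open" totals are simply the number of responses minus the "public" and "closed" totals, respectively.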

Similarities task. The items on the similarities task 
consisted of groups of two to four words which could be 
“categorized” under a single concept. Seventeen such
groups of words were presented orally one at a time to 
the Ss, who were instructed to “tell what the words 
have in common—that is, in what way they are alike.” 

The scoring system developed (Moran, 1953; Moran 
et al., 1957) to classify each response is based upon 
the same multivariable form of analysis employed in 
the object-sorting task. Thus, although the two tasks 
differ considerably in a number of respects, the same
seven measures used for object-sorting may be applied
to the similarities task.

⁴ A revised set of subarea categories and supplementary
instructions for scoring conceptual areas as
used in this study has been deposited with the American
Documentation Institute. Order Document No. 5067
from the ADI Auxiliary Publications Project, Photoduplication
Service, Library of Congress, Washington
25, D.C., remitting in advance $1.75 for microfilm or
$2.50 for photocopies. Make check payable to Chief,
Photoduplication, Library of Congress.

Grassi Block-Substitution Test. Grassi’s (1953) test
consists of a set of 24 cubes of identical size, painted
different colors on the different sides. Five large blocks
were made out of 20 of the regular cubes; each block was
composed of four cubes glued together to form a
specific pattern. The four remaining cubes were to be
put together by S in four different ways for each of five
designs, using a model presented by the examiner as a
referent. The four types of tasks required copying the
block model on the top, on the top but in a different
color, on the top and sides in the same color, and on the
top and sides but in a different color. Levels of conceptual
performance on these four types of tasks were
characterized by Grassi as simple-concrete, simple-
abstract, complex-concrete, and complex-abstract,
respectively.

In Grassi’s study, S’s performance on each of the
20 tasks was scored on the basis of time and accuracy
credits, but he employed only a single total performance
score⁵ in arriving at an estimate of group differences.
He also reported, however, the number of complete
failures for each experimental group on each of the 20
tasks. To investigate the feasibility of adapting the
Grassi schema to a multivariable form of analysis, the
test data in the present study were scored not only in
terms of total performance and number of complete
failures, but also in general accordance with the multi-
variable schema employed in the analysis of responses
in the object-sorting and similarities tasks. Thus,
group differences in performance scores were assessed
for each of the four “levels of performance” or scoring
“quadrants” in the Grassi schema.


Hypotheses 


Object-sorting and similarities tasks. The initial in- 
vestigation by McGaughran and Moran (1957) dealt 
with a specific population (VA hospital patients) and a 
specific task (object-sorting). The present study is in- 
tended not only to check the validity of these findings 
in the usual sense of broadening the reference popula- 
tion, but also in terms of construct validity—i.e., testing 
a designated schema of multivariable analysis of con- 
ceptual behavior under varying operational conditions. 
Thus, since parallel systems of scoring were developed 
for the object-sorting and similarities tasks, the same 
hypotheses are presented for both sets of tasks. 

The general hypothesis of differences between brain-
damaged and schizophrenic groups in both object-
sorting and similarities tasks is that the brain-damaged
group has a greater deficit in higher order concepts
(loss of “openness”).

The specific hypotheses stated in terms of the op-
erations of both tasks in the study are as follows: In
comparison with the schizophrenic group, the brain-
damaged group employs (a) significantly more closed
concepts, (b) significantly fewer open-private concepts,
and significantly more (c) closed-public and (d) closed-
private concepts; and (e) scores significantly lower on an
autistic index consisting of open-private minus closed-public
concepts.


⁵ Complete scoring instructions may be found in
Grassi’s (1953) test manual.


Grassi Block-Substitution Test. In accordance with 
Grassi’s (1953) findings, the specific hypotheses, stated 
in terms of the operations of his study, are as follows:
In comparison with the schizophrenic group, the brain-
damaged group (a) scores significantly lower on total 
score and (b) demonstrates significantly more complete 
failures on all four levels of performance but with more 
extreme group differences in performance occurring in 
the simple-concrete and complex-concrete tasks. 

To test the feasibility of adapting the Grassi schema 
to a multivariable form of analysis, it was further 
hypothesized that brain-damaged patients score
significantly lower on the (a) simple-concrete and (b)
complex-concrete levels of performance.


RESULTS 


To provide a statistical control over the 
effects of intelligence, group differences on 
each measure were calculated by analyses of 
covariance except for counts of complete 
failures on the Grassi test; for these, analyses 
of differences in percentage were employed. 
The tables are arranged so that, when appro- 
priate, the results in the present investigation 
may be compared directly with those obtained 
in the initial studies. 
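
A minimal sketch of one way a critical ratio (CR) for a difference in percentages can be computed is given below. The pooled two-proportion formula and the function name are assumptions; the source does not state which formula was used. The counts 102 and 123 are derived here from the 68.0% and 82.0% complex-abstract failure rates of the present study in Table 3, taking 150 task attempts per group (30 Ss multiplied by 5 tasks, as the note to that table describes).

```python
import math


def critical_ratio(fail_a, n_a, fail_b, n_b):
    """Pooled two-proportion critical ratio (z) for a difference in failure rates."""
    p_a = fail_a / n_a
    p_b = fail_b / n_b
    # Pool both groups to estimate the common failure rate under no difference.
    pooled = (fail_a + fail_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / standard_error


# Complex-abstract level, present study: counts derived from Table 3 percentages.
cr = critical_ratio(102, 150, 123, 150)
print(round(cr, 2))
```

Under this assumed formula the complex-abstract comparison comes out at about 2.80, in line with the CR given in Table 3 for that level; the values for the other levels are reproduced only approximately, so the original computation may have differed in detail.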

The results for the object-sorting measures 
for both the original and the present study are 
shown in Table 1. In the present study, all five 
measures involved in the hypotheses show 
significant mean differences in the direction 
predicted. 
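
The adjusted F ratios reported in the tables come from analyses of covariance with intelligence as the control variable. The source gives no computational detail, so the following is only a sketch of the textbook single-covariate procedure, which assumes a common within-group regression slope. The function name and the small data set are hypothetical and serve only to show the steps.

```python
def ancova_f(groups):
    """Single-covariate analysis of covariance.

    groups: list of (covariate_values, score_values) pairs, one per group.
    Returns (F, df_between, df_within) for the covariate-adjusted group effect.
    """
    def cross(xs, ys):
        # Sum of cross-products of deviations from the means.
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    all_x = [x for xs, _ in groups for x in xs]
    all_y = [y for _, ys in groups for y in ys]

    # Total sums of squares and products (groups ignored).
    sxx_total = cross(all_x, all_x)
    syy_total = cross(all_y, all_y)
    sxy_total = cross(all_x, all_y)

    # Pooled within-group sums of squares and products.
    sxx_within = sum(cross(xs, xs) for xs, _ in groups)
    syy_within = sum(cross(ys, ys) for _, ys in groups)
    sxy_within = sum(cross(xs, ys) for xs, ys in groups)

    # Remove the variation in the scores predictable from the covariate.
    adjusted_total = syy_total - sxy_total ** 2 / sxx_total
    adjusted_within = syy_within - sxy_within ** 2 / sxx_within
    adjusted_between = adjusted_total - adjusted_within

    df_between = len(groups) - 1
    df_within = len(all_y) - len(groups) - 1  # one df lost to the covariate
    f_ratio = (adjusted_between / df_between) / (adjusted_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical data: (estimated intelligence, number of closed concepts).
schizophrenic = ([90, 95, 100, 105, 110], [3, 4, 5, 6, 7])
brain_damaged = ([92, 98, 104, 108, 97], [9, 10, 12, 11, 10])
f_ratio, df_between, df_within = ancova_f([schizophrenic, brain_damaged])
print(round(f_ratio, 2), df_between, df_within)
```

With two groups of 30 Ss, as in the present study, the same procedure would give 1 and 57 degrees of freedom for the adjusted group effect.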

The results for the similarities task are shown 
in Table 2. In this task, all five of the measures 
involved in the hypotheses show group mean 
differences in the direction predicted, and four 
of these (excluding closed-private) are sig- 
nificant at the .01 level or better. 

On the Grassi test, the group difference on 
total score was in the predicted direction and 
significant at the .05 level. While it is not 
possible to compute the significance of group 
differences in the Grassi study from the data 
provided, a comparison⁶ of differences in mean,
range, and group overlap between the two 
studies indicates quite clearly that the original 
groups differed at a considerably higher level 
of significance than the groups in the present 
study. 

The distribution of complete failures for 
each of the four performance levels on the 
Grassi test for both studies is shown in Table 3. 
It can be seen that in the Grassi study the
brain-damaged group exceeded the schizophrenics
in number of failures at all levels of
performance beyond the .001 level of significance.
On the other hand, in the present
study, the p values for group differences in
number of failures for the two performance
levels involving abstraction drop to .05 and
.01. This, of course, means that there was less
group distinctiveness in impairment or “intellectual
deterioration” when abstractive
ability was involved.

⁶ Upon request, mimeographed copies of a table
showing these comparisons will be provided by the
writers.

TABLE 1
COMPARISON OF GROUP DIFFERENCES OBTAINED IN THE MCGAUGHRAN AND MORAN STUDY AND
IN THIS STUDY IN CONCEPTUAL AREA SCORES ON AN OBJECT SORTING TASK

                   McGaughran and Moran Study                 Present Study
                   Schizophrenic Brain-Damaged Adjusted       Schizophrenic Brain-Damaged Adjusted
                   (N = 37)      (N = 34)      F ratioᵃ       (N = 30)      (N = 30)      F ratioᵃ
Measure            M      SD     M      SD     F      p       M      SD     M      SD     F      p
Public             11.00  4.70   11.27  3.67   0.82   n.s.    8.50   4.89   12.10  5.03   13.49  <.01
Closed             6.05   3.35   11.18  3.29   25.34  <.001   5.03   3.77   9.87   3.19   27.34  <.001
Closed-Public      3.78   2.62   6.77   3.46   14.71  <.001   4.00   3.37   6.90   3.09   11.59  <.01
Closed-Private     2.27   2.11   4.41   2.93   3.78   <.06    1.03   1.11   2.97   1.83   22.84  <.001
Open-Public        7.22   3.98   4.50   2.74   2.43   n.s.    4.50   4.28   5.20   3.51   1.47   n.s.
Open-Private       5.73   3.95   3.32   2.48   10.11  <.01    9.47   4.99   3.93   2.84   32.25  <.001
Autistic indexᵇ    11.95  5.95   6.55   5.13   22.99  <.001   15.47  7.40   7.03   4.78   27.28  <.001

ᵃ Analysis of covariance was used to control for the effects of intelligence on task performance.
ᵇ Open-private minus closed-public, plus 10.

TABLE 2
SIGNIFICANCE OF DIFFERENCES IN CONCEPTUAL AREA SCORES ON A SIMILARITIES TASK BETWEEN
SCHIZOPHRENIC AND BRAIN-DAMAGED GROUPS

                   Schizophrenic Brain-Damaged
                   (N = 30)      (N = 30)      Adjusted
Measure            M      SD     M      SD     F ratioᵃ  p       r      p
Public             8.57   4.34   12.03  3.14   13.49     <.01    .24    n.s.
Closed             6.03   3.43   10.63  3.21   25.68     <.001   -.41   <.01
Closed-Public      4.40   3.28   8.63   3.16   25.01     <.001   -.35   <.01
Closed-Private     1.63   1.84   2.00   1.59   .45       n.s.    -.12   n.s.
Open-Public        4.17   3.05   3.40   3.19   .30       n.s.    .58    <.001
Open-Private       6.80   3.60   2.97   2.21   26.21     <.001   -.23   n.s.
Autistic indexᵇ    12.40  6.03   4.33   4.62   32.05     <.001   .09    n.s.

ᵃ Analysis of covariance was used to control for the effects of intelligence on task performance.
ᵇ Open-private minus closed-public, plus 10.

TABLE 3
COMPARISON OF GROUP DIFFERENCES OBTAINED IN THE GRASSI STUDY AND IN THE PRESENT
STUDY ON PERCENTAGE OF ACTUAL FAILURES AT EACH LEVEL OF PERFORMANCE ON
THE GRASSI BLOCK SUBSTITUTION TEST

                      Grassi Studyᵃ                                Present Study
                      Schizophrenic  Brain-Damaged                 Schizophrenic  Brain-Damaged
Level of Performance  (N = 86)ᵇ      (N = 102)ᵇ     CR     p       (N = 30)ᵇ      (N = 30)ᵇ      CR     p
Simple-Concrete       0.0            10.2           7.28   <.001   1.3            19.3           5.51   <.001
Simple-Abstract       5.1            43.9           13.86  <.001   30.0           43.3           2.38   <.05
Complex-Concrete      1.4            47.2           16.36  <.001   20.0           42.0           4.15   <.001
Complex-Abstract      40.9           90.6           16.56  <.001   68.0           82.0           2.80   <.01

ᵃ Figures derived from Grassi’s (1953, p. 65) table of number of failures on each of 5 tasks on each level of performance.
ᵇ Ns shown are for number of Ss; the total number of tasks failed and passed at each level of performance equals number of Ss multiplied by 5.

Scores on the Grassi test for each of the
four levels of performance are shown in Table
4. The results indicate that, as predicted, the
brain-damaged group shows significantly (p =
.05) more impairment than the schizophrenics
on the simple-concrete and complex-concrete
levels of performance, but the groups do not
differ significantly on the performance levels
involving the abstractive process.

In a final analysis of the Grassi test data,
rank-order correlations (N = 60) were com-
puted between all possible combinations of
score distributions for the four performance
levels. The values for the correlations were all
positive and ranged from .57 (for complex-
concrete vs. complex-abstract) to .72 (for
simple-concrete vs. complex-abstract). All
correlation values were significant at the .001
level. A corresponding analysis of the amount
of association between the “publicness-private-
ness” and “openness-closedness” variables in-
dicated that distributions of scores obtained
in these measures were not significantly re-
lated in either the object-sorting or similarities
tasks.

TABLE 4
SIGNIFICANCE OF DIFFERENCES IN LEVELS OF PERFORMANCE SCORES ON THE GRASSI BLOCK
SUBSTITUTION TEST BETWEEN SCHIZOPHRENIC AND BRAIN-DAMAGED GROUPS

                      Schizophrenic  Brain-Damaged
                      (N = 30)       (N = 30)       Adjusted
Level of Performance  M      SD      M       SD     F ratioᵃ  p
Simple-Concrete       4.93   0.08    [?].37  [?]    4.68      <.05
Simple-Abstract       3.50   1.71    [?].83  [?]    1.81
Complex-Concrete      4.00   1.63    [?].90  [?]    4.49      <.05
Complex-Abstract      1.60   1.69    .90     [?]    2.40

ᵃ Analysis of covariance was used to control for the effects of intelligence on test performance.

DISCUSSION
Efficiency of Individual Prediction

The total group mean was used as a cutting
point for the major predictor variables on each
of the three tasks; for the object-sorting and
similarities tasks, the major predictor was the
sum of “closed” concepts and, for the Grassi
test, it was the total score. On the object-
sorting task, the number of correct predictions
for the schizophrenic group was 24 out of 30
(80%); for the brain-damaged group, it was
25 out of 30 (83%). The similarities task data
yielded correct predictions for 23 out of 30 Ss
(77%) in each group. The number of correct
predictions obtained⁷ with the Grassi test was
19 out of 30 (63%) for the schizophrenics and
14 out of 30 (47%) for the brain-syndrome
group.

⁷ Had the cutting point suggested by Grassi (1953,
p. 30) been used, the percentage of correct predictions
would have been 33% for the schizophrenic group and
67% for the group with brain damage.

When one considers the amount of selectiv-
ity employed in this and in similar studies to
form relatively “pure” experimental groups
(e.g., selecting brain-damaged cases without
psychosis), it seems reasonable to conclude
that none of the three tasks in its present form
appears to be a very effective clinical instru-
ment for independent diagnosis of brain dam-
age. However, it seems likely that predictive
validity in this area might well be increased
through the development of a more sensitive
set of task or scoring characteristics within the
present schematic contexts.

It should be stated that Grassi quite clearly
emphasized that scores on his test should
never be used apart from qualitative behav-
ioral signs in making clinical judgments.
Nevertheless, in addition to Yates’s (1954)
criticism of this stand, it may be added that
most of the behavioral signs (e.g., perseveration,
requests for reassurance) suggested by Grassi
are in no way specific to his test and, con-
sequently, would not enter into a judgment of
its diagnostic effectiveness.


Cross-Validation of the Two Schemas


“Conceptual area” schema. An analysis of
the object-sorting data in Table 1 indicates
that the findings of group differences in the
two studies were much the same, except that
there was a much stronger inclination for the
schizophrenic group in the present study to
use open-private sortings at the expense of
lowered scores on the open-public measure.
This difference can possibly be accounted for
in terms of a corresponding difference in sam-
pling. The significantly (p = .001) older
schizophrenics of the present sample had, for
the most part, been hospitalized for a number 
of years as contrasted with the younger group 
of VA patients used in the original study. The 
greater degree of chronicity for the present 
group, then, may be associated with a greater 
tendency to employ open-private sortings, a 
type of conceptualization which presumably 
would increase with increased social with- 
drawal. 

Closely equivalent results were obtained in 
the present study with the object-sorting and 
similarities tasks. Since the same multivariable 
schema was employed to develop correspond- 
ing systems of scoring under rather widely 
different task conditions, the conceptual vari- 
ables of amount of social agreement and order 
of conceptual classification do not seem bound 
to any one set of operations. The major differ- 
ence between schizophrenic and brain-damaged 
groups in their performance on both conceptual 
tasks was in terms of order of conceptual classi- 
fication. As in the original McGaughran and 
Moran study, these differences are characterized
as tendencies toward “under-abstraction” in the
brain-damaged group and “over-abstraction” in
the schizophrenics.

Grassi “single total score” schema. When the
variables of age, sex, intelligence, and educa- 
tion were controlled, the brain-damaged group 
still showed a significantly greater “intellectual 
deterioration” (Grassi, 1953, p. 30). However, 
the theoretical implications of such a finding 
seem rather limited. In the absence of any 
further denotation, the determination of greater 
or lesser “impairment in both the concrete and 
abstract spheres” (p. 13) conveys nothing
except more or less accuracy or speed in a 
series of tasks which, since the scores are simply 
additive, presumably vary only in level of 
difficulty. 

As for the practical implications of the re- 
sults for the total score measure, there are 
rather striking differences between the findings 
in the two studies. Although it is not entirely 
clear, it seems likely that these discrepancies 
may be attributed to differences in controls and 
in sampling. Concerning controls, Grassi states, 
“it was felt that the test would be at fullest 
efficiency if employed with subjects above the 
defective level. No limitations as to age, sex, 
or any other factor were found necessary” 
(1953, p. 60). Since age, sex, education, and 


intelligence were controlled in the present 
study, a considerable amount of the discrep- 
ancy between the two studies may be attrib- 
uted to differences in the effects of these 
factors. 

Concerning sampling, the greatest part of 
the discrepancy in the findings of the two 
studies consists of differences between the 
schizophrenic groups. For example, the schizo- 
phrenics in the present study scored signifi- 
cantly more failures at every level of perform- 
ance than the corresponding group in the 
Grassi study. Grassi reported that he chose 
equal numbers of “deteriorated” and
“nondeteriorated” schizophrenics “without regard
to sub-classification” (1953, p. 62), and con- 
cluded at the end of his study that “non- 
deteriorated schizophrenics reveal no signifi- 
cant degree of impairment” (p. 66). However, 
he offered no criteria to distinguish between 
deterioration and nondeterioration, and this 
distinction is maintained nowhere in the 
actual analysis of his data nor in his setting of 
cutting points. Thus, it is not possible to make 
a definitive analysis of differences in amounts 
of “deterioration” in the two schizophrenic 
samples, except to state that the group in the 
present study presumably evidenced more of it. 


Grassi Schema as a Multivariable Form of Analysis

In terms of the frequency of complete fail- 
ures in the Grassi test, there was less group 
distinctiveness between schizophrenics and the 
brain-damaged on the two levels of perform- 
ance in which abstractive ability was involved. 
This could conceivably be interpreted to mean 
that the simple-complex and concrete-abstract 
measures represent somewhat different vari- 
ables of conceptualization and that, con- 
sequently, the Grassi schema could be adapted 
to provide a system of multivariable analysis 
of conceptual behavior. 

There is one rather serious limitation to such 
an interpretation. To coincide with Grassi’s 
original procedure, the analysis of failures in 
the present study was in terms of total number, 
regardless of the number of Ss involved. When 
the present data were reanalyzed in terms of 
number of Ss showing failures at each of the 
performance levels, none of the group differ- 
ences proved to be significant. However, this 
difficulty was resolved to some extent
(although variances were not homogeneous)
by turning to an analysis of group differences 
in mean scores at each of the four performance 
levels. Here again the brain-damaged group 
showed significantly greater impairment than 
the schizophrenics in tasks involving simple 
perceptual fidelity with varying degrees of 
complexity, but the group difference in impair- 
ment disappeared in the two tasks requiring 
an abstractive reduction of visual images. 

While there seems to be some basis for 
inferring that the simple-complex and con- 
crete-abstract measures in the Grassi test may 
represent somewhat different sources of varia- 
tion in conceptual behavior, the degree of 
intercorrelation among the scores at all levels 
of performance indicates that the measures 
are far from independent, at least as they are 
presently employed. Greater separateness or 
independence of these variables may be 
achieved by a systematic alteration of task 
requirements. 
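
The rank-order correlations among the four performance levels (reported above as ranging from .57 to .72 for N = 60) bear directly on this question of independence. As a sketch only, and assuming that Spearman’s coefficient with average ranks for ties is the statistic meant, such a correlation could be computed as follows. The function names and the eight pairs of scores are hypothetical and do not reproduce the study’s data.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank-order correlation: Pearson r computed on average ranks."""
    rank_x = average_ranks(xs)
    rank_y = average_ranks(ys)
    n = len(rank_x)
    mean_x = sum(rank_x) / n
    mean_y = sum(rank_y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    return covariance / (var_x * var_y) ** 0.5


# Hypothetical level scores for eight Ss (not data from the study).
simple_concrete = [5, 5, 4, 3, 5, 2, 4, 1]
complex_abstract = [3, 2, 2, 1, 4, 0, 1, 0]
rho = spearman_rho(simple_concrete, complex_abstract)
print(round(rho, 2))
```

A coefficient near zero for a pair of levels would be the pattern expected of independent variables; the uniformly positive and sizable values reported for the Grassi levels are what argue against independence.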


SUMMARY 


The primary purposes of this study were to 
attempt to cross-validate previous findings of
two schemas for analyzing differences in conceptualization
between schizophrenic and
brain-damaged groups, and to investigate the 
feasibility of adapting one of these (the Grassi 
schema) into a multivariable system of anal- 
ysis corresponding in form (multivariable 
interaction) but not in content with the other 
(“conceptual area”) schema, used previously 
by the present investigators. 
Object-sorting, similarities, and block-sub- 
stitution tasks were employed. The “con- 
ceptual area” schema correctly predicted 
hypothesized group mean differences (p = .01 
or .001) in all five of the measures in object- 
sorting and in four out of five in the similari- 
ties task. The Grassi schema successfully 
predicted group mean differences (p = .05) 
in total score on the block-substitution task. 
Further analysis of the Grassi test data 
indicated that the brain-damaged group 
showed greater impairment in tasks involving 
simple perceptual fidelity but not in tasks
requiring an abstractive reduction of visual
images. It was suggested that greater inde- 
pendence of these conceptual variables may be 
achieved by experimentally manipulating 
task requirements. 

A GENERAL FORMULA FOR THE QUANTITATIVE TREATMENT OF
HUMAN MOTIVATION 
WALTER TOMAN 


Brandeis University 


MAN always is motivated, but also
is satisfying motives at the same
time. While preparing food in order
to eat, he is satisfying the motive to handle and
manipulate things. Yet while eating, he still 
is motivated to get food on his plate, to cut 
his meat, or to chew it. Satisfactions of mo- 
tives, however, can be defined, among other 
ways, by the decrease of motivation that they 
bring about. And the decrease of motivation 
can, in turn, be determined by means of sub- 
sequent satisfactions. Other things being equal, 
the satisfaction value of a meal, i.e., its capac- 
ity to reduce a person’s hunger, can be inferred 
from the time at which he chooses to have his 
next meal. What holds for eating holds for any 
recurrent motive that can be identified, 
whether it is something as primitive and 
“physiological” as moving or even breathing 
or something as complicated as the motive to 
be in company or to write a paper. All of a 
person’s motives are forever on the increase 
from the moment they were last satisfied until 
they are satisfied again. On the other hand, 
satisfactions (of various degrees) reduce mo- 
tive intensities (by corresponding amounts). 
Some motives are always more urgent than 
others, but when they are satisfied, they make 
room for others. They fall back toward the 
rear of the race. 

This race changes its complexity as the in- 
dividual develops. New motives are being 
formed all the time. The only thing the new- 
born can do effectively and somewhat by 
intention is sucking. At the age of one year, 
however, he can suck, bite, grasp, hold, and 
move various things, he can crawl, sit, stand, 
etc. And the intervals between successive 
satisfactions of motives tend to increase and to 
become more variable. The newborn can keep 
awake for an hour at a time, and the adult 
for 16 hours. The newborn falls asleep under 
almost any circumstances after one-and-a-half 
hours of waking. The adult can, if necessary, 
go without sleep even after 36 hours of waking. 
Similar conditions could be shown to prevail 


with other recurrent motives, although varia- 
tions of opportunity to satisfy motives as well 
as the fact that motives can do something for 
each other to various degrees (i.e., substi- 
tutability) complicate matters. Anyway, the 
race increases in complexity with a person’s 
psychological development. It slows up, and 
more things happen simultaneously. 

A spectator trying to keep track of this 
race of motives within a person is hopelessly 
lost unless he focuses on only a few at a time. 
To do so throughout a person’s course of de- 
velopment is, of course, out of the question. 
What may be possible, however, is to trace 
a sample of a person’s motives over a certain 
stretch of time and estimate the kind of race 
that is being run. If we can find motives that 
are being satisfied recurrently by everyone 
and that are relatively uncomplicated by 
differences of opportunity and by substitute 
motives, or if we can interpolate for differ- 
ences of opportunity and include at least the 
most relevant substitute motives, we may be 
able to infer something very general and im- 
portant about a person in comparative, not 
absolute, terms: the level of his motivational 
development. 

For this purpose, the following general 
formula is offered: 


$$\sum_{i=1}^{n} c_i = C$$

In words: the sum of intensity increments of 
motives that can be distinguished within a 
given person at a given time of development is 
equal to a constant. This assumption is claimed 
to be the simplest of all possible holistic 
assumptions about a person’s motivation. 

In the formula, $c_1$ stands for intensity increment 
of Motive 1, say sleeping; $c_2$ for that of 
Motive 2, say eating; $c_3$ for that of talking, 
etc. These increments can be estimated from 
the average intervals ($i$) between successive 
satisfactions of each motive by taking their 
reciprocals ($c = 1/i$). If a person’s average 
interval between successive occasions of sleep 
is 16 hours, the intensity increment per hour 
is 1/16; if he eats every 4 hours, the intensity 
increment per hour is 1/4; etc. They can also 
be estimated from the differences between 
maximal and minimal intervals ($d$) between 
successive satisfactions ($c = 1/d$). If a person’s 
minimal interval between successive occasions 
of sleep has been 6 hours, say during the past 
month, and his maximal interval 36 hours, his 
range ($d$) would be 30 hours, and his intensity 
increment 1/30 per hour. These two types of 
estimates cannot be mixed within one sample. 
While the interval between successive satis- 
factions is the time from the end of one to the 
onset of the subsequent satisfaction of a mo- 
tive, it may often suffice to measure or esti- 
mate intervals from onset to onset of successive 
satisfactions. 
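
The two estimates described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the text; the function names are hypothetical, and exact fractions are used so that the results match the worked values (1/16, 1/4, 1/30) literally.

```python
from fractions import Fraction


def increment_from_average(avg_interval_hours):
    """Intensity increment per hour as the reciprocal of the average interval (c = 1/i)."""
    return 1 / Fraction(avg_interval_hours)


def increment_from_range(min_interval_hours, max_interval_hours):
    """Intensity increment per hour as the reciprocal of the range d = max - min (c = 1/d)."""
    d = Fraction(max_interval_hours) - Fraction(min_interval_hours)
    if d <= 0:
        raise ValueError("maximal interval must exceed minimal interval")
    return 1 / d


# Worked values from the text.
sleep_avg = increment_from_average(16)     # sleep every 16 hours
eat_avg = increment_from_average(4)        # a meal every 4 hours
sleep_range = increment_from_range(6, 36)  # range d = 30 hours
print(sleep_avg, eat_avg, sleep_range)
```

Because the two estimates measure different things, a sample of motives should use one function or the other throughout, as the text requires.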

Letter n stands for the number of motives 
that can be identified within a person. Motives 
1, 2, ..., m-2, m-1 are those that have 
already been satisfied, motives in the state 
of recurrence, such as sleeping, eating, talking, 
etc. Motives m, m+1, m+2, ..., n-2, n-1, n 
are motives that have not yet been satisfied, 
motives in the process of formation, such as 
graduating from high school or college, own- 
ing a house, marrying a certain person, becom- 
ing competent in electronics, etc. They repre- 
sent the “free energy portion” of a person. 

The C can be taken as a constant for a given 
individual throughout his development. This 
is, if nothing else, the simplest assumption 
about C and at least worth a try. C can per- 
haps be understood as the over-all motiva- 
tional energy with which a person operates 
throughout his life. 

The only additional assumption we need 
make about motivational development is that 
the number of distinguishable motives (n) 
increases with development. Therefore, the 
intensity increments with which any given 
recurrent motive grows until it is once more 
satisfied must, on the whole, decrease with 
development. Empirical evidence has already 
been indicated. 
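
The step from a growing n to shrinking increments can be made explicit. As a sketch that assumes only the formula itself, a positive constant C, and the assumption that n grows, consider the mean increment per motive (the bar notation is introduced here for illustration):

$$\bar{c} = \frac{1}{n}\sum_{i=1}^{n} c_i = \frac{C}{n}, \qquad n' > n \;\Rightarrow\; \bar{c}' = \frac{C}{n'} < \frac{C}{n} = \bar{c}.$$

This constrains only the mean, which is why the text says the increments decrease "on the whole"; any single motive may deviate from the trend.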

The motive to sleep, e.g., takes an hour 
with the newborn to reach a given intensity, 
that at which satisfaction usually occurs, 
whereas it takes 16 hours with the adult. Hence 
the intensity increment per hour with which 
the adult’s motive to sleep grows in intensity 
is 1/16 of that of the newborn. 

Since n becomes impracticably large even 
early in a person’s development, since new 
motives and new substitute motives are con- 
tinually being formed, since there are even 
substitutions by force (due to psychological 
losses, e.g., weaning), and since opportunity 
to satisfy motives is such a complicated and 
variable thing, this formula can be applied in 
practice only with discretion by means of 
appropriate sampling and in approximations. 
That, however, is possible. This paper pre- 
sents nomothetic evidence to test all aspects of 
the formula with subjects or groups of sub- 
jects about whose developmental differences 
there can be little doubt. 

EXPERIMENTAL EVIDENCE 
Decreasing Increments 

Let us see whether single motives reflect the 
developmental trend of decreasing increments 
(or of increasing average intervals between 
successive satisfactions and increasing ranges 
of intervals). Let us take sleeping, which is 
likely to be highly uncomplicated by differ- 
ences of opportunity. A person can usually 
impose the amount of sleep he needs on the 
circumstances of his life. In fact, he would not 
live for long if he could not. Let us take eating 
(meals and snacks) which is a little more 
opportunity-dependent. Let us take the mo- 
tive to have refreshments (such as Coca-Cola, 
milk shake, ice cream, coffee, etc., consumed 
in cafeterias, drugstores, and the like, but 
not together with a meal), and the motive to 
visit movie theaters, both of which are obvi- 
ously highly determined by opportunity. Let 
us study these motives in high school sopho- 
mores, high school seniors and college juniors, 
of both sexes, ethnically and socioeconomically 
matched, 15, 17, and 20 years old on the aver- 
age, and ask them to rate their average, their 
largest, and their smallest interval between 
successive periods of sleep for the past two 
months, between successive meals plus snacks 
(ignoring the night interval) for the past two 
months, and between successive visits to 
movie theaters and to refreshment parlors 
for the past year. The subjects’ estimates of 
average intervals will be used directly, whereas 
their estimates of largest and smallest inter- 
vals will be transformed into the range d 








TABLE 1 


AVERAGE INTERVALS BETWEEN SUCCESSIVE 
SATISFACTIONS OF MOTIVE 
(Means of estimated averages) 


Visiting Having 
Movie Refresh- 

Subjects Sleeping Eating Theaters ments 
(age groups) N (hours) (hours) (days) (days) 


15 years 15.3 29.:! 


3.8 
17 years 15.7 4.1 
4.2 


20 years : 16.0 18 


TABLE 2 
RANGES OF INTERVALS BETWEEN SUCCESSIVE 
SATISFACTIONS OF MOTIVE 
(Mean differences between estimated largest and 
estimated smallest interval) 


Visiting Having 
Movie Refresh- 
Subjects Sleeping Eating Theaters ments 
(age groups) N (hours) (hours) (days) (days) 
15 years 29 6.9 76.9 
17 years 19 10.7 76.4 
20 years 32 13.9 46.7 


(difference between the maximal and the mini- 
mal interval between successive satisfactions). 

Tables 1 and 2 show the results. Data for 
sleeping and eating confirm significantly the 
predicted developmental trend (all ¢ tests of 
difference are significant at least on the .05 
level), whereas visiting movie theaters and 
having refreshments behave more irregularly, 
in ways perhaps attributable to differences in 
opportunity. 

Let us see whether a small sample of mo- 
tives also reflects the developmental trend of 
decreasing increments, perhaps more ade- 
quately than a single arbitrarily chosen mo- 
tive. The sum of intensity increments of the 
motives sleeping, eating, visiting movie the- 
aters, and having refreshments should de- 
crease with age. Computing intensity incre- 
ments from the reciprocals of the average 
intervals between successive satisfactions 
(from Table 1) yields the following for the 
15-year-old group: 


$$\sum_{i=1}^{4} c_i = \frac{1}{15.31} + \frac{1}{3.77} + \frac{1}{29.48} \times \frac{1}{24} + \frac{1}{4.78} \times \frac{1}{24} = 0.34.$$

Similarly, sums of 0.31 and 0.30 were found 
for the 17- and 20-year-olds respectively. 
Computing intensity increments from the 


TABLE 3 
SUMS OF INTENSITY INCREMENTS FOR FOUR MOTIVES 
(Indices based on data in Tables 1 and 2) 

Subjects 
(age groups)     N     Σc(i) (a)    Σc(d) (b)    Σc(d)/Σc(i) 

15 years        29     0.34         0.31         0.92 
17 years        19     0.31         0.24         0.76 
20 years        32     0.30         0.21         0.68 

(a) Estimate based on summed reciprocals of average 
intervals between satisfactions, from Table 1. 
(b) Estimate based on summed reciprocals of range 
of intervals between successive satisfactions, from 
Table 2. 



reciprocals of ranges of intervals (given in 
Table 2) shows the same thing: the sum of 
intensity increments decreases with develop- 
ment (see Table 3). A parallel analysis based on 
the averaging of measures of intensity incre- 
ment computed for each individual separately 
shows the same trend, significant at the .05 
level. 
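
The sum for the 15-year-old group can be rechecked with a short script. This is a sketch under stated assumptions: the four intervals are those quoted in the worked sum in the text, the movie-theater interval is read as 29.48 days where the print is unclear, and the variable and function names are hypothetical. Intervals given in days are converted to hours so that all increments are per hour.

```python
HOURS_PER_DAY = 24

# Average intervals for the 15-year-old group, converted to hours.
intervals_hours = {
    "sleeping": 15.31,
    "eating": 3.77,
    "visiting movie theaters": 29.48 * HOURS_PER_DAY,  # reported in days (assumed reading)
    "having refreshments": 4.78 * HOURS_PER_DAY,       # reported in days
}


def sum_of_increments(intervals):
    """Sum of reciprocals of the average intervals (increments per hour)."""
    return sum(1.0 / hours for hours in intervals.values())


total_15 = sum_of_increments(intervals_hours)
print(round(total_15, 2))
```

Sleeping and eating dominate the total; the two motives measured in days contribute only about 0.01 between them, so the rounded sum is not sensitive to the exact movie-theater interval.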

One can assume on various grounds that 
the variability, in our case the range, of inter- 
vals increases more rapidly with develop- 
ment or age than does the average interval 
between successive satisfactions. This rela- 
tionship is illustrated by the data on sleeping 
and eating presented in Tables 1 and 2. There- 
fore, intensity increments should decrease 
more sharply with development when esti- 
mated by the reciprocals of the range of inter- 
vals than when estimated by average intervals 
between successive satisfactions. The decreas- 
ing ratios in the third column of Table 3 
show that this is the case for our sample. 
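
The ratio column can be approximated from the rounded sums printed in Table 3. The sketch below does so and checks that the ratio falls with age; the names are hypothetical. The recomputed ratios come out close to, but not identical with, the printed ones, presumably because the printed column was formed before rounding.

```python
# Rounded sums from Table 3: age group -> (sum from average intervals, sum from ranges).
table3 = {15: (0.34, 0.31), 17: (0.31, 0.24), 20: (0.30, 0.21)}

# Ratio of the range-based estimate to the average-based estimate.
ratios = {age: from_range / from_avg for age, (from_avg, from_range) in table3.items()}
for age, ratio in ratios.items():
    print(age, round(ratio, 2))

# The claim in the text: the ratio decreases from each age group to the next.
ages = sorted(ratios)
is_decreasing = all(ratios[a] > ratios[b] for a, b in zip(ages, ages[1:]))
print(is_decreasing)
```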

We have assumed that n, the number of 
motives that are distinguishable within a per- 
son, increases with the person’s development. 
This should hold for any sample of n as well. 
The motive to be in the company of people, 
for example, should give rise to an increasing 
number of varieties or derivatives with age. 
Hence the number of different people with 
whom a person has associated or, more opera- 
tionally, the number of such people that he 
can list within a given period of time should 
increase with development or age. 

This relationship was tested with another 
sample of subjects of the same three age groups 
(again high school sophomores, high school 
seniors, and college juniors, of both sexes, 
approximately matched as to socioeconomic 
and ethnic background) who were given five 
TABLE 4 
SPONTANEOUS RECALL OF FRIENDS 
AND ACQUAINTANCES 

                               Average Number of “Friends 
                               or Acquaintances” 
Subjects (age groups)    N     Listed in 5 minutes    Present in Class 

15 years                22     36.6 
17 years                20     39.0 
20 years                32     54.4 


minutes in which to list the names of any 
friends and acquaintances they could think of. 
First names or nicknames were acceptable as 
long as the subjects could afterwards identify 
for themselves whom they had thought of. 
After counting the number of names in their 
lists, subjects were asked to state how many 
of those named were present in class. Table 4 
shows the results, which confirm significantly 
the postulated trend. Deduction of those of 
them who were present in class and may have 
been accidentally looked at rather than 
thought of would make the trend even more 
pronounced. 

The motive to be in the company of one’s 
most intimate friends is a sample of the many 
derivatives and substitutions continually 
arising from the motive to be in the company 
of people. On the assumption that intensity 
increments of motives should decrease with 
age, even in the case of such a sample, we 
should expect that longer separation and a 
greater variability of separations from one’s 
most intimate friends can be tolerated as a 
person develops. In order to test this, the 
same three groups of subjects were asked to 
list three most intimate friends and indicate 
the average, the longest, and the shortest 
separation from them for the past seven 
months (i.e., from September through March). 
The average lengths of separations were used 
directly, whereas the ranges of separations 
were computed from the longest and shortest 
separations indicated by each subject. For 
technical reasons those friends listed from 
whom they had been separated for all seven 
months were excluded from consideration. 
There were no such cases with the 15-year-olds, 
two with the 17-year-olds, and seven with the 
20-year-olds. Tables 5 and 6 show the results. 


They confirm significantly the postulated 
TABLE 5 
AVERAGE DURATION OF SEPARATION FROM FRIENDS 

Average Duration of Separation (in days) 
From Most Intimate Friend Listed: 

Subjects (age groups)    N 

15 years                22 
17 years                20 
20 years                32 





TABLE 6 
RANGES OF SEPARATIONS FROM FRIENDS 
(Mean differences between estimated longest and 
shortest intervals between get-togethers with 
three most intimate friends) 


Range of Separation (in days) From Most 
Intimate Friend Listed: 

Subjects (age groups)    Total Mean of All Three    R 

15 years 
17 years 
20 years 


TABLE 7 
DURATION OF PERIODS OF CONSECUTIVE SCHOOL 
WORK OUTSIDE OF CLASS 
(Means of estimated average and maximum periods) 

Number of Consecutive Work Hours 

Subjects (age groups)    N    Average    Maximum 

15 years 
17 years 
20 years 


22 1. 
20 £ 
32 2.§ 


trend. The average separations from three 
most intimate friends as well as their ranges 
do increase with age. Even the variety among 
the three most intimate friends as indicated 
by the ranges of the means shown in Tables 
5 and 6 increases with age (see last column). 
Let us see whether a sample of the “free 
energy portion” of a person’s motives, i.e., of 
the not-yet-recurrent Motives m, m+1, m+2, ..., 
n-2, n-1, n, also reflects the developmental 
trend postulated. Intensity increments of these 
motives should decrease with development, 
although the very fact that they have still to 
be satisfied for the first time makes them diffi- 
cult to appraise. However, the work done for 
the sake of such remoter goals, or for any one 
of them, should reflect the decrease of intensity 
increments. A person’s periods of consecutive 
work or, put differently, the periods that he 
can tolerate involving delay of other satisfac- 
tions not inherent in work, should grow in 
length and variability with development. 
Hence the average as well as the largest num- 
ber of consecutive hours spent at work for 
school during the past seven months should 
increase with age among our three age groups. 
Table 7 shows significantly that they do. 


Substitution Effects 


One of the various complications that 
hamper the automatic use of the formula being 
proposed is the fact that motives can do some- 
thing for each other. They form substitution 
continua. It has been suggested that the 
formula can be legitimately applied to a motive 
only if at least the most relevant substitute 
motive is considered with it jointly. Following 
this suggestion, let us consider eating and 
smoking, which should be on a common con- 
tinuum at least for some of their aspects. A 
person who smokes more may eat less, and 
vice versa. But even sleep may be affected. 
Smokers claim that cigarettes can often post- 
pone sleep for a while. So even sleep and smok- 
ing should have something in common, be it 
no more than deeper breathing. In order to 
test these views, a group of 80 college students 
of an average age of 20.4 years was divided 
into 49 smokers (reporting average intervals 
and ranges of intervals between successive 
cigarettes of 2.1 and 14.7 hours respectively) 
and 31 nonsmokers. Both groups were asked 
to indicate their average, largest, and smallest 
intervals between successive occasions of sleep 
and meals (including snacks, but excluding the 
night intervals from consideration) for the past 
two months. 


TABLE 8 
SUBSTITUTION EFFECTS 
(Means of estimated average intervals between 
satisfactions, and of range between minimum 
and maximum intervals) 

                      Sleeping (in hours)      Eating (in hours) 
Subjects       N      Avg. int.    Range       Avg. int.    Range 

Smokers       49      16.3         12.6        4.5          8.0 
Nonsmokers    31      16.0         11.1        3.9          6.2 



Table 8 shows the results. The average in- 
tervals between successive periods of sleep 
and meals as well as their ranges were sig- 
nificantly different for smokers and non- 
smokers. Smokers slept and ate somewhat 
less and could tolerate greater variability of 
either. They appeared to be slightly more 
“mature” when appraised by eating alone, 
and, apparently, smoking enabled them to 
appear so. 


Development in an Infant 


Let us see whether single motives also re- 
flect the developmental trend of decreasing 
increments in an individual case. Would the 
postulated trend for our indices of intensity 
increment hold for eating and sleeping in the 
development of an infant? In order to test 
this, the sleeping and feeding schedules of a 
breast-fed baby weaned at the end of five 
months were recorded for five consecutive 
days in the baby’s first, third, tenth, and 
thirteenth month of life. The means of these 
values are presented in Table 9. 

The results confirm significantly the postu- 
lated trends. Average intervals between suc- 
cessive periods of sleep and successive feedings 
as well as their ranges do increase with develop- 
ment. Comparison of Table 9 with Tables 1 
and 2 may make one wonder whether the 
infant studied was not more mature in eating 
than the 15-, 17-, and 20-year-olds. He shows a 
larger average interval and range of intervals 
at 13 months than they do. Such is not quite 
the case, however, since the night intervals 
between meals were not included in the ratings 
of the older subjects, whereas with the infant 
they were. Including the night interval would 


TABLE 9 
EATING AND SLEEPING IN THE DEVELOPMENT 
OF AN INFANT 
(Mean interval between satisfactions, and mean range 
between minimum and maximum intervals, for 
five consecutive days in each month) 



Eating (in 


Sleeping (in 
hours) 


Subject During hours) 
Avg. int. Range Avg. int. Range 


Ist month 0.8 0.4 a 
3rd month 1.4 1.0 3. 
10th month oe 2.6 4. 
13th month 4.8 3.2 4. 


a 
0 
8 
8 
have raised the average intervals of all three 
groups by more than an hour, and their ranges 
by considerably more. 


Additional Evidence 

It was possible to secure a few additional 
data on the ontogenetic development of as 
primitive a “‘motive” as breathing. The author 
happened to record in skimpy diaries kept 
through several years of his youth, and for 
reasons partly forgotten, the lengths of time 
he could stop breathing at ages 11, 14, 16, and 
21. The measures recorded were 70, 92, 135, 
and 175 seconds respectively, which reflects 
the developmental trend to be expected. It 
should be noted, however, that breathing is, 
by degree, quite different from all other mo- 
tives mentioned. It is constantly being satis- 
fied with anyone alive, and something like the 
average or maximal interval between suc- 
cessive inhalations depends, among other 
things, on such primitive constitutional givens 
as basal metabolism or the size of one’s chest. 
This should be pointed out, since breathing 
would weigh heavily if it were included in any 
pool of motives. Its intensity increment for an 
adult is about 3000 times larger than, e.g., 
the intensity increment of eating. The author 
would not be surprised, though, if a sufficiently 
complex treatment of breathing should yield 
as worthwhile an indicator of motivational 
development as sleeping does. In fact, he has 
some inconclusive evidence that the ratio of 
maximal over minimal intervals between suc- 
cessive inhalations (in which the latter was 
obtained from the number of inhalations per 
minute after climbing rapidly three flights of 
stairs) does increase with the level of motiva- 
tional development or ego strength as estab- 
lished by other measures, at least up to 20 
years of age. 
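
The order of magnitude quoted above for breathing can be checked with rough figures. The sketch assumes a resting adult rate of about 12 breaths per minute and an eating interval of 4 hours (the text's own illustrative value); both are illustrative inputs chosen here, not the author's data.

```python
SECONDS_PER_HOUR = 3600

# Assumed illustrative values, not taken from the paper's tables.
breaths_per_minute = 12        # assumed typical resting adult rate
eating_interval_hours = 4.0    # the text's example of eating every 4 hours

breathing_interval_s = 60 / breaths_per_minute
eating_interval_s = eating_interval_hours * SECONDS_PER_HOUR

# Increments are reciprocals of intervals, so their ratio is the inverse ratio of the intervals.
ratio = (1 / breathing_interval_s) / (1 / eating_interval_s)
print(round(ratio))
```

Under these assumptions the ratio comes out a little under 3000, in line with the "about 3000 times" stated in the text.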

Finally the substitution continuum “genital 
satisfaction” was studied with two separate 
groups of ten and nine male subjects. Mastur- 
bation and heterosexual intercourse were 
chosen as samples. The subjects had to rate 
in retrospect their average, largest, and small- 
est intervals between successive satisfactions 
for different years in their development. Aver- 
age intervals between successive satisfactions 
as well as ranges tended to increase with de- 
velopment, although this was true for hetero- 
sexual intercourse only after marriage, pre- 
sumably because society places restrictions on 
premarital relations. Another finding may 
perhaps be of greater interest. Masturbation 
plus heterosexual intercourse treated indis- 
criminately as genital satisfaction and rated 
by the subjects for the year following their 
first (as it happens, premarital) heterosexual 
intercourse showed greater average intervals 
and ranges of intervals between successive 
satisfactions than masturbation alone during 
the previous year. This means that although 
a “new” motive, heterosexual intercourse, has 
been added, the developmental trend of in- 
creasing intervals and ranges of intervals 
between successive genital satisfactions per- 
sists. It also means that the power (i.e., the 
capacity to reduce motive intensity) of this 
new motive is greater than that of other mo- 
tives assembled on the continuum. Clinical 
evidence confirms that of course. Masturbation 
tends to become rarer and more irregular as 
soon as heterosexual intercourse has begun. It 
does so even during that period of time in 
which the adolescent has found a love object 
and not yet begun to have heterosexual inter- 
course. 


DISCUSSION 


When does this end? When, on the whole, 
do the average intervals between successive 
satisfactions of motives and their ranges stop 
increasing and their intensity increments stop 
decreasing? And if there is such a thing as a 
decline with aging, they must. I have no defi- 
nite answer. Yet somewhere between birth 
and death there should be a reversal of this 
trend. Perhaps in the third decade of life, or 
in the sixth. For any particular motive, how- 
ever, this may happen much earlier or later. 
In fact, different substitution continua may de- 
velop at a different pace throughout a person’s 
development. Apparently the developmental 
pace may change, provided that opportunities 
to satisfy the motive in question change first. 
A person may be motivationally underde- 
veloped in his contact with physical things, 
but well developed in his contact with people. 
Or he may have been drastically disappointed 
by people and “regressed” in this area of his 
world, but developed in the other. This is 
why a sample of a person’s motivational de- 
velopment should be taken from motives in 
different substitution continua. 

Is not something like friendship really be- 
yond the reach of all attempts at quantifica- 
tion? Well, certainly not in some respects, and 
I would not underestimate those. If I know 
that a wife and a husband, of their own accord, 
see each other only once or twice a year, or 
that two friends, although separated for 
several years at a time, have always kept up 
correspondence, I know something quite essen- 
tial about their relationship, no matter how 
much I do not know. In fact, I would not want 
to miss that information on any account, if 
I can have it. 

“All right,” the reader may say, “but take 
then your sweeping assumption that all mo- 
tives increase in intensity from the moment 
they have been satisfied to the moment they 
are being satisfied again, and that satisfaction 
reduces that intensity. Is that really true of 
complex motives? Do we not want to see more 
and more of a good friend or beloved, the more 
we have tasted of him or her?” Yes, but then 
we may not have had yet all we wanted. We 
may have “tasted,” but not “eaten.” Another 
intriguing thing about people is that they offer 
innumerable aspects of satisfaction which 
sometimes cannot be exhausted even on a 
thousand successive contacts. Potentially, 
people are the most complex of all conceivable 
objects. So, while things are going well, we 
would by definition want to see more and more 
of them. On the other hand, there is no person, 
no matter how dear, of whom we could not 
get tired temporarily, at least in some respects. 
Suppose our spouse suddenly decides never to 
let go of us even for a minute. She or he might 
become oppressive before long. 

Have I then treated complex motives as 
if they were primary drives? If the major 
difference between the two kinds of forces is 
complexity, even the most complex motive 
must have something in common with primary 
drives. This is what a general theory of motiva- 
tion must assume, if it is not merely the pre- 
tense of a theory. If I had not treated them 
somewhat alike, I believe I should have. 

One may wonder whether I have not talked 
about actual events rather than motives. But 
how can we talk more safely about motives 
than in terms of actual events? If a person has 
taken a meal, visited a movie theater, or even 
married a certain person, can we not be sure 
of one thing: a motive to eat, a motive to 
visit a movie theater, or a motive to marry 
that person, respectively, was one of the de- 
terminants that led to that event? This is not 
to deny, of course, that there are other deter- 
minants too. 

Most of our data are based on self-ratings. 
Can they be relied on? I think so on two 
grounds. First, the subjects had no idea of the 
genetic nature of the studies, although they 
were asked to use pseudonyms rather than 
their own names for identification. This may 
have conveyed that the particular individual 
was not the focus of interest. Second, as in- 
dicated, the self-ratings concerned actual 
events, and very trivial ones at that. They 
involved no special, and possibly misleading, 
psychological concepts on their part. 

Of the two measures used, the range of 
intervals between successive satisfactions of a 
motive should be more sensitive than the 
average interval. In the case of self-ratings, 
the range is based on the subject’s estimate of 
his longest and shortest period of abstinence 
from the satisfaction of a motive over a given 
period of time, and both of these are likely to 
be relatively impressive events. The subjects 
should be able to rate them with greater 
validity than the average period of abstinence. 
The latter is the result of some semiarticulate 
computation or of an intuitive guess based on 
an altogether tacit computation, and that 
means greater possibilities of error. In accord- 
ance with this it was found that the increases 
with age of the ranges of intervals between 
successive satisfactions tended to be more 
significant than those of the average intervals. 

The reader may have tried to account for 
the empirical data in other terms. I have no 
objection. In fact, I could do it myself. How- 
ever, I think I can do it more consistently and 
economically with the formula suggested, and 
that was one of the reasons for suggesting it. 
The pieces of evidence should not be judged on 
their own merits, but rather on their compati- 
bility with the general context in spite of their 
relative heterogeneity. I would not think of 
them as conclusive. Yet they illustrate and 
confirm in a variety of ways what appears to 
me to be conceptually conclusive. 
Is the formula suggested not altogether 
trivial? If waking periods increase in length 
and variability with development, must there 
not be more things to fill them? Yes, but why 
should waking periods increase in the first 
place? And why should any one of the things 
that fill them tend to increase and vary its 
abstention spans in similar ways? Not that 
the formula explains it; it postulates the gen- 
erality of this relationship. In addition, these 
recurrent satisfactions of distinguishable mo- 
tives can fill waking periods of a given length 
and variability in many different ways with 
different people and/or under different sets 
of opportunities. What is more: none of these 
satisfactions involves one motive only. Its 
occurrence has multiple determination. And 
all of these satisfactions can do something for 
one another to various degrees ranging from 
zero substitution value to its maximum (which 
is satisfaction of the identical motive). 

If we want to distinguish explicitly substi- 
tution continua among a person’s motives, the 
basic formula would have to be re-written as 
follows: 


$$\sum_{i=1}^{n_1} c_i + \sum_{i=1}^{n_2} c_i + \dots + \sum_{i=1}^{n_{p-1}} c_i + \sum_{i=1}^{n_p} c_i = \sum_{t=1}^{p} C_t = C$$



In this formula, $C_t$ is the sum of intensity 
increments of all motives assembled on a given 
substitution continuum, say, oral stimulation 
and manipulation; $p$ is the number of substi- 
tution continua on which $n$ motives (in which 
$n = \sum_{t=1}^{p} n_t$) are assembled. In words: the sum 
of all sums of intensity increments of motives 
assembled on all of a person’s substitution 
continua equals C. If there were as many sub- 
stitution continua as motives (p = n, but also 
NM, = n), the motives would be unrelated to 
each other. If every motive appears on every 
substitution continuum, the number of dis- 
tinguishable motives becomes the product of 
the number of motives that can be identified 
on any one continuum and the number of 
substitution continua ($n_t p = n$). This would 
reflect the most intimate interrelationship of 
motives possible. Obviously we will not find 




this degree of interrelationship of motives or 
this degree of “integration” in reality, no 
matter how high we have a person climb on the 
ladder of maturity. We can expect, however, 
that a person with a given n will be the more 
mature, the greater the number of motives 
that are represented on more than one sub- 
stitution continuum, and the greater the num- 
ber of substitution continua on which they 
are represented. Hence, both $n_s$ and $p$ tend to
grow with a negative increment relative to $n$
as a person develops, so that $n_s$ will approach
$n/p$ and $p$ will approach $n/n_s$, both, of course,
without reaching either. We will learn something
about a person’s motivational maturity if we 
inspect, e.g., on how many of a given sample 
of substitution continua a given sample of 
motives appears. A person who satisfies the 
motive to make music by playing records, 
singing in a choral society, playing the violin 
in an orchestra, and writing compositions, is 
likely to have developed further in this area 
than a person who just plays records. Suppose, 
then, that his motive to engage in conversations
may be satisfied by talking about music
and musical events, about politics, history, 
art, and problems involved in maintaining a 
family, a house, and a car, whereas the other 
may be talking only about musical events (and 
even these could be restricted to record re- 
leases) and the maintenance of house and car. 
And suppose that in home contacts with 
friends, he may be dining, drinking, chatting 
about current events, listening to music, and 
even performing some, singing operas to- 
gether and playing cards, whereas the other 
would only be listening to music, chatting 
about current events (of a more restricted 
nature), and drinking. It is likely that the 
first man is more mature, not only in one area, 
but generally in his motivational development 
than the second. 
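
The bookkeeping behind the re-written formula can be sketched in a few lines. This is only an illustration under stated assumptions: the continuum names other than oral stimulation and manipulation, all of the intensity-increment values, and the variable names (`increments`, `continuum_sums`, `total_c`) are invented here and are not part of the formula. The sketch shows no more than that summing within each substitution continuum and then across continua gives the same total as summing over all motives at once.

```python
# Hypothetical intensity increments, grouped by substitution continuum.
# Continuum names and all numbers are invented for illustration only.
increments = {
    "oral stimulation and manipulation": [0.30, 0.10, 0.05],
    "music": [0.20, 0.15],
    "conversation": [0.12, 0.08],
}

# Sum of intensity increments within each continuum.
continuum_sums = {name: sum(values) for name, values in increments.items()}

# Sum of all continuum sums; in the formula this total is the person's C.
total_c = sum(continuum_sums.values())

# The same total, obtained by summing over every motive directly.
flat_total = sum(v for values in increments.values() for v in values)

p = len(increments)                                      # number of continua
n = sum(len(values) for values in increments.values())  # number of motives

print(continuum_sums)
print(p, n, round(total_c, 2))
```

The sketch deliberately leaves out the case of a motive that is represented on more than one continuum, since the text does not say how its increment would be apportioned.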

C was assumed to be a quantity character- 
istic of the individual. But changes in C are 
conceptually possible. The reader should be 
warned, though, against thinking that chang- 
ing moods, shifts from high to low feelings or 
vice versa, are evidence of changes of C. 
These moods or feelings are usually linked with 
specific changes of opportunity to satisfy 
motives, or with satisfactions themselves. 
There is still another consideration: the relationship
between intensity increments of motives
and C may be a more complicated one to
begin with. Possibly it is the squares of in- 
tensity increments of a person’s motives that 
add up to C. What speaks in favor of our initial 
formula in which C is taken as an intra-individ- 
ually invariant quantity, however, is its sim- 
plicity, and none of our empirical tests have 
contradicted it. The precise nature of the relationship
between C and n remains to be
clarified. We have assumed no more than that 
n increases with development, but it is likely 
that the rate of increase is, in principle, a 
function of C and of an individual’s chronologi- 
cal age at the time of inspection, perhaps a 
function of their product. 

If, moreover, satisfaction of motives has any 
bearing on learning or cathexis (of conditions 
under which motives can be satisfied), such as 
that the amount of ongoing learning or cathexis 
is a function of the amount of concomitant 
satisfaction, C should determine the average 
rate of learning or cathexis characteristic of a 
person, or vice versa. The decline of intensity 
increments of given motives due to the growth 
of n is, perhaps, not the only manifestation of
C in the course of development. Another one 
may be the accumulation of the results of 
learning or cathexis, or the growth of knowl- 
edge. The two seem interrelated even on 
common sense grounds, but in our terms the 
amount of knowledge that a person has accu- 
mulated would be a function of the level of 
motivation he has reached, and vice versa. 
As a matter of fact, the amount of knowledge 
may often appear a more convenient indicator 
of a person’s general or even specific motiva- 
tional development. An individual’s psycho- 
logical world happens to be organized not only 
by substitution continua, but also by objective 
relationships of contiguity and instrumen- 
tality. In the infant’s mind, there is an order 
such as mother’s breast, the bottle, his own 
hands, the pacifier, etc., but also another one 
like mother’s breast, mother’s hands (that 
can touch him or pass the bottle), mother’s 
face (that can smile), mother’s dress (which 
may change), etc. The larger the number of 
identifiable motives for which mother has 
been learned to be instrumental or, in other
words, the greater the number of aspects of her
that have been cathected, or the greater the 
“knowledge” of mother, the farther advanced 
will the infant’s motivational development 
tend to be. 

On this reasoning, we should be able to 
infer a person’s theoretical C from his rate of 
learning or cathexis, and the latter can be 
tested. We could then determine the extent to 
which empirical approximations to C on the 
basis of our formula fall short of his theoretical 
C. The deficit would be the result of restrictions 
or losses of opportunity to satisfy motives, and 
would provide an index of the extent to which 
substitution of motives by force, or repression, 
has hampered his motivational development. 

Is all this of any relevance for clinical 
psychology or psychiatry? I think so. I think 
that the clinician is implicitly proceeding by 
all these considerations. I have only tried to 
articulate some of the aspects involved in the 
appraisal of an individual’s motivational development
or “ego strength.” What may be
more, all aspects of the formula, if used with 
good sense and caution, can themselves be 
useful for clinical purposes. Repeated inspec- 
tion of a few motives such as sleeping, eating, 
smoking, or even reading, but also engaging in 
relationships with people in general, of the 
opposite sex in particular, or being in the open 
air, may tell us whether a person is continuing 
to grow, beginning to slow up, or even regress- 
ing. Such data may suggest psychopathology 
before a person shows any more conspicuous 
clinical symptoms. The rates at which specific 
neurotic symptoms recur may also be viewed 
from a similar perspective. 


SUMMARY 


The simplest version of a comprehensive 
formula for the quantitative treatment of 
human motivation and its development was 
suggested. It can be handled in practice by 
appropriate sampling and approximations. 
The major aspects of the formula were sub- 
jected to empirical tests. The formula proved 
feasible in all of the aspects investigated. 


Received September 25, 1957.

When one studies psychotherapy interviews,
one is struck by the lawfulness
and interconnectedness of the events
in them. It seems obvious that a patient, once 
launched on a theme, is likely to continue on 
that theme; it seems that various forms of 
resistance are equivalent, so that one kind can 
replace another; and it seems that interven- 
tions by the therapist occur at predictable 
times and have predictable effects. Is there 
any way of transforming such impressions into 
quantitative statements which would be 
amenable to tests of statistical significance 
and other techniques of quantitative measure- 
ment? 

To make a quantitative description of 
psychotherapy, an investigator must assign 
each unit in the interview to one of a set of 
categories. In the research reported here, the 
sentence was chosen as the unit to be classified. 
Each sentence, therefore, had to be assigned 
to one of the categories of our content-analysis
system. The system of classification used and
the rules for categorization were developed by
Dollard and Auld (6). The categories referred 
to in this paper are only a few of the more 
than 60 in the system. 

Having assigned each unit to one of the 
categories, one can then study various ques- 
tions. Four questions are considered in this 
paper. 

1. Does the patient’s speech hang together, 
i.e., is the patient likely to continue with an- 
other unit belonging to the same category? 

Because the patient’s behavior hangs to- 
gether and because he is motivated by learned 
drives to be intelligible and logical, we expect 
his talk or silent thought about a topic to 
persist. For instance, if the patient in one unit 
is talking about his hostile feeling toward his 
wife, he may be expected to continue to talk 
about this in the next unit. Thus, a unit 
scored as “hostility” should often be followed
by another unit scored the same way. A
similar prediction would be made about any
content category.

1 This investigation was supported by a research
grant, M-648 (C3), from the National Institute of
Mental Health of the National Institutes of Health,
Public Health Service.

2. Are various forms of resistance equiva- 
lent as shown by their occurrence in units that 
follow one another? 

If the patient resorts to silence in a case 
conducted under the free-association condi- 
tion, he is breaking the free-association rule. 
Such behavior is believed by psychoanalysts 
(5, pp. 103-106) to be resistant. If silence is 
resistant, one might reasonably expect that 
communications of the patient preceding or 
following silence would also be resistant; that 
is to say, if the patient is resistant at a partic- 
ular time, it will be shown at one moment by 
silence and at another moment by resistant 
talk. According to psychoanalytic theory, 
therefore, silences should often be preceded 
and followed by resistant talk. 

Even when the free-association rule has not 
been laid down, it is reasonable to consider 
silences as resistant. According to psycho- 
analytic theory, communication from patient 
to therapist is necessary to progress. When the 
patient is silent, he is not communicating 
verbally with the therapist; therefore, he is 
failing to act in a way that could advance the 
work. 

Whether silence is in fact correlated with 
verbal resistance should, therefore, test the 
correctness of analytic views about silence. 
It should be said, however, that psychoana- 
lysts would consider the context in which the 
silence occurred when deciding whether it was 
resistant. Our study of silences does not in- 
clude this refinement; we are only testing 
whether in general—neglecting nuances of 
context—silences function as resistance. 

3. Does the patient’s activity affect the 
likelihood that the therapist will intervene 
and, if he does intervene, the kind of interven- 
tion he will make? 

According to psychoanalytic theory, the 
therapist ought to intervene when the patient 
is resistant, and the therapist’s intervention 
should be an interpretation of the resistance. 
Resistances must be dealt with to make it 
possible for the patient to continue to learn
more about himself, in other words, to continue
making the unconscious conscious (8,
pp. 61-69). 

Thus, a therapist acting on psychoanalytic 
principles would be expected to intervene 
more often after resistant talk than after non- 
resistant talk, and he would be expected to 
intervene by interpretation. The more skilled 
the therapist, the better able he should be to 
identify resistances quickly and to interpret 
them promptly. 

4. Does the kind of intervention made by 
the therapist affect the subsequent activity 
of the patient? 

Any adequate theory of psychotherapy 
must account for the effect of the therapist’s 
interventions on the patient. Psychoanalytic 
theory, attempting to do this, describes some 
interventions as “interpretations,” that is, as 
attempts by the therapist to label unconscious 
responses of the patient (5, p. 82; 7). Other 
interventions are called “support,” “guidance,”
“reassurance,” “questions.” Analytic
theory says that an interpretation of a resist- 
ance, if apt, interrupts the resistance and 
allows the patient to resume his task of learn- 
ing about himself. Client-centered therapists, 
on the other hand, believe that interpretations 
are likely to evoke resistance and interfere 
with the therapeutic process. According to 
these therapists, clarification of feeling rather 
than interpretation is the preferred technique 
(10). 

In this paper, results throwing light on 
these four questions are presented. Our aim, 
however, is somewhat more general than to 
answer these questions; it is to demonstrate 
the value of studying sequential dependencies 
in psychotherapy, or, in other words, the value 
of studying what-follows-what. 

The method, in general, is to study whether 
the kind of score assigned to Unit n affects the
likelihoods of various scores being assigned to 
Unit n + 1. For example, if one knows that 
the patient has spoken a sentence classified as 
“resistance,” does that help to predict what 
kind of unit will come next? 

We make no claim to originality in using 
content analysis to study psychotherapy (2) 
or even in studying what-follows-what. How- 
ever, we do believe that no one has previously 
worked with psychotherapy data in the form 
of transition probabilities. 


METHOD 


Material. Four psychotherapy cases, carried by 
four different therapists, were studied. Two of the 
therapists (who treated Patients A and B) had been 
fully trained in psychoanalysis and had considerable 
experience; the other two, having taken an introduc- 
tory course in psychotherapy, were having their first 
experiences as psychotherapists. All of the patients had 
applied for psychotherapy to a psychiatric outpatient 
clinic, and their treatment was carried on under the 
auspices of the clinic. 

Recordings were made of all interviews with the 
knowledge and consent of the patients. Patient A had 
73 sessions, Patient B had 17, Patient C had 12, and 
Patient D had 34. Although every interview was 
carefully transcribed, not all of the hours were scored. 
Eighteen hours from Case A, 9 from B, 11 from C, and 
12 from D were analyzed. The original purpose in 
scoring these interviews was to try out the developing 
content-analysis system. Therefore, the interviews 
chosen for scoring are not ideally representative of Cases 
A and D but come instead from the earliest parts of 
these cases. The hours selected from Case B are repre- 
sentative, since all odd-numbered interviews were 
scored. All hours but one of Case C were scored, the 
first being omitted because the scorers did not wish to 
deal with the special features of an initial interview. 

All the cases may be described as examples of psy- 
choanalytically oriented psychotherapy. In Cases A, 
C, and D, the therapist insisted on the free-association 
rule; in Case B, the therapist did not, making the in- 
terview more like an ordinary conversation. The thera- 
pist of Case B also talked much more than the other 
therapists, so that his utterances occupied about a 
third of the total time. Therapist A’s utterances took 
up about a fifth of the time, and Therapists C and D 
talked less than A. 

None of these cases was an outstanding success, 
and one—Case C—must be considered a definite thera- 
peutic failure. Although not outstandingly successful, 
Cases A, B, and D were, in our opinion, conducted with 
at least average skill, compared to the general run of 
cases in the psychiatric outpatient clinic. 


TABLE 1 
RELIABILITY OF CONTENT ANALYSIS 


Category 


For patient’s sentences: 
Anxiety 
Dependence 
Hostility 
Love 
Mild Agreement 
Resistance 
Sex 
Social Mobility 


For therapist’s utterances: 
Demand 
Interpretation 
Reward 


* All coefficients shown are statistically significant at the .001
level.

Content analysis. Each transcribed interview was
divided into sentences according to the rules developed
by Auld and White (3). Then two psychologists² independently
scored each sentence of the interviews included
in this study.

In evaluating the reliability of this scoring, one 
must consider each unit separately, because the scores 
are to be used in studying the relationships between 
single units and the next-following units. 

Thus, product-moment correlations between the 
total counts for various categories for whole hours are 
irrelevant.³ To measure agreement on separate units,
a rank-order correlation, tau (9, 1), was computed for 
each of the categories used in this paper. The coeffi- 
cients are shown in Table 1. 

Counting of sequences and computation of probabilities.
The various sequences that were to be studied were
counted, then were expressed as conditional probabilities.
For example, the H → H sequence (score of H
on Unit n, followed by score of H on Unit n + 1) was
counted. The number of H scores was also counted.
Then the conditional probability of an H on Unit
n + 1, given that Unit n was scored H, was computed.
This probability is, of course, the number of
H → H sequences, divided by the number of H scores.
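
The counting procedure just described can be sketched as follows. This is a minimal illustration, not the authors' own computation: the function name `conditional_probability` and the short sequence of scores are invented, and the denominator here counts only units that have a successor, which departs slightly from "the number of H scores" whenever the last unit of a sequence is an H. In practice one would presumably also keep sequences from crossing the boundary between two interviews.

```python
def conditional_probability(scores, category):
    """Return P(Unit n+1 is category | Unit n is category) and
    P(Unit n+1 is category | Unit n is not category) for one sequence."""
    after_same = after_other = 0   # times the category follows each kind of unit
    n_same = n_other = 0           # units of each kind that have a successor
    for current, following in zip(scores, scores[1:]):
        if current == category:
            n_same += 1
            after_same += following == category
        else:
            n_other += 1
            after_other += following == category
    p_same = after_same / n_same if n_same else float("nan")
    p_other = after_other / n_other if n_other else float("nan")
    return p_same, p_other


# Invented sequence of sentence scores: H = Hostility, A = Anxiety, R = Resistance.
scores = ["H", "H", "H", "A", "A", "H", "H", "R", "R", "A"]
p_hh, p_xh = conditional_probability(scores, "H")
print(p_hh, p_xh)
```

Comparing the two returned values corresponds to the comparison reported in the Results between units that follow a Hostility sentence and units that follow any other sentence.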

The possible scores for each unit can be considered 
“states” of the psychotherapy situation. These states 
are mutually exclusive and exhaustive, because the 
content analysis assigns each sentence to one and 
only one category. The conditional probabilities can 
be called transition probabilities, i.e., probabilities of 
the passage of the psychotherapy situation from State 
A to State B, if one wishes to adopt the terminology of 
mathematical statistics (4).
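
Because the states are mutually exclusive and exhaustive, the transition probabilities out of any one state should sum to 1. The sketch below estimates a full table of transition probabilities from a single sequence of states; the function name `transition_matrix` and the example sequence are hypothetical and serve only to make the idea concrete.

```python
from collections import Counter, defaultdict


def transition_matrix(scores):
    """Estimate transition probabilities between states from one sequence."""
    pair_counts = defaultdict(Counter)
    for current, following in zip(scores, scores[1:]):
        pair_counts[current][following] += 1
    matrix = {}
    for state, counts in pair_counts.items():
        total = sum(counts.values())
        # Each row holds P(next state | current state).
        matrix[state] = {nxt: c / total for nxt, c in counts.items()}
    return matrix


# Invented state sequence; the labels are placeholders, not scored data.
states = ["Sex", "Sex", "Hostility", "Hostility", "Hostility", "Silence", "Sex"]
matrix = transition_matrix(states)
for state, row in matrix.items():
    print(state, row)
```

A state that occurs only as the final unit has no outgoing transitions and therefore no row in the result.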


RESULTS 

Coherence of patient’s speech. To study the 
likelihood of persistence of categories, we com- 
puted the transition probabilities of the cate- 
gories for each of our four cases. It was 
discovered that the categories do, indeed, per- 
sist. For instance, the likelihood of a sentence 
scored Sex is greater after Sex sentences than 
after other sentences, and the likelihood of a 
sentence scored Hostility is greater after hostile
sentences than after nonhostile sentences.
Such a result was obtained for all of the categories
studied: Anxiety, Dependence, Hostility,
Love, Resistance, Sex, Silence, and
Social Mobility. All of these differences are
statistically significant and quite large. Furthermore,
these differences were found in
every case studied.

2 John Dollard and Frank Auld, Jr.

3 Such hour-total correlations, we have found, tend
to be higher than correlations computed from scores on
single units; for instance, in Table 1 the median tau
coefficient (based on single-sentence comparison) for
patient’s categories is .58; the median r (based on hour-total
comparison) for the same categories is .87. The
larger size of r’s can be attributed to compensating
errors in the hour-counts of the two scorers. For instance,
Scorer A calls Units 1 to 10 “resistance,”
while Scorer B calls them “anxiety”; but Scorer B
calls Units 53 to 62 “resistance,” while Scorer A calls
these “hostility.” Considering both sequences of units
together, the scorers agree that 10 of the units are
“resistance,” even though they do not agree on the
scoring of any of the separate units.
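
The compensating-error example given in the footnote on hour-total correlations can be checked directly. The sketch below reproduces that example; the label "other" for the remaining units is an assumption made only so that both scorers have a complete set of scores, and the variable names are hypothetical.

```python
# Scores for Units 1-62, following the footnote's example. Units outside
# the two disputed runs get an arbitrary shared label ("other"); that
# filler is an assumption, not something reported in the text.
units = range(1, 63)
scorer_a = {u: "other" for u in units}
scorer_b = {u: "other" for u in units}
for u in range(1, 11):        # Units 1 to 10
    scorer_a[u] = "resistance"
    scorer_b[u] = "anxiety"
for u in range(53, 63):       # Units 53 to 62
    scorer_a[u] = "hostility"
    scorer_b[u] = "resistance"

# The hour totals for "resistance" agree perfectly.
total_a = sum(label == "resistance" for label in scorer_a.values())
total_b = sum(label == "resistance" for label in scorer_b.values())

# Yet no single unit is scored "resistance" by both scorers.
joint = sum(
    scorer_a[u] == "resistance" and scorer_b[u] == "resistance" for u in units
)
print(total_a, total_b, joint)
```

This is why agreement has to be assessed unit by unit when the scores are used to study sequences of single units.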

A typical finding is that for the Hostility
category. The chances are 71 out of 100 that
Sentence n + 1 was scored Hostility if Sentence
n was scored Hostility. However, if
Sentence n was not scored Hostility, the
chances are only 3 in 100 that Sentence n + 1
was scored Hostility.

It occurred to us that this finding might be 
attributable, in part, to the tendency of a 
scorer to score a whole sequence of sentences 
the same way. In other words, the result 
would demonstrate a coherence in the scorer’s
behavior, not in the patient’s. To exclude
any effect of the scorer’s tendency to score
successive sentences the same way, we therefore
analyzed the data again, using the scores
of different scorers for the two sentences of
each sequence. If Scorer A’s judgment was
used in classifying Sentence n, Scorer B’s was
used in classifying Sentence n + 1. As expected,
this procedure cut down the size of
the differences found; but the differences remained,
were still very large, and were still
statistically significant. For example, it was
found (when the material was analyzed in this
way) that if Sentence n was scored Hostility,
the probability was .51 that Sentence n + 1
was Hostility; but if n was not Hostility, the
probability was only .07 that n + 1 was
Hostility.
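
The cross-scorer control can be sketched in the same way as the original count, with one scorer's judgment used for Sentence n and the other's for Sentence n + 1. The function name `cross_scorer_probabilities` and the two short lists of scores are hypothetical; the sketch assumes only that both scorers scored the same sentences in the same order.

```python
def cross_scorer_probabilities(first_scorer, second_scorer, category):
    """Classify Unit n with one scorer and Unit n+1 with the other.

    Returns P(n+1 is category | n is category) and
    P(n+1 is category | n is not category).
    """
    if len(first_scorer) != len(second_scorer):
        raise ValueError("both scorers must score the same units")
    hits_after_cat = hits_after_other = 0
    n_cat = n_other = 0
    # Pair the first scorer's Unit n with the second scorer's Unit n+1.
    for current, following in zip(first_scorer, second_scorer[1:]):
        followed = following == category
        if current == category:
            n_cat += 1
            hits_after_cat += followed
        else:
            n_other += 1
            hits_after_other += followed
    p_cat = hits_after_cat / n_cat if n_cat else float("nan")
    p_other = hits_after_other / n_other if n_other else float("nan")
    return p_cat, p_other


# Invented scores from two scorers for the same eight sentences.
scorer_a = ["H", "H", "A", "H", "H", "A", "A", "H"]
scorer_b = ["H", "H", "H", "A", "H", "A", "A", "A"]
result = cross_scorer_probabilities(scorer_a, scorer_b, "H")
print(result)
```

Since neither sentence of a pair is classified by the same person, a scorer's habit of scoring runs of sentences alike cannot by itself produce a difference between the two probabilities.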

Equivalence of silence and resistant talk. Silence
did, indeed, occur more often after resistant
talk than after nonresistant talk (see Fig.
1). The probability was .38 that a silence of at
least 5 seconds would occur after a sentence
scored as resistant; it was only .04 that a
silence would occur after a nonresistant sentence.
This difference is statistically highly
significant. Furthermore, silences were likely
to be followed by resistant talk. Dividing each
silence into 5-second units,⁴ one finds that the
silence would continue for another 5 seconds
50% of the time. A resistant sentence occurred
in 15 out of 100 cases, and nonresistant talk
or an utterance by the therapist occurred in
35 out of 100 cases. The .15 probability of
resistant talk after silence may be compared
to the .04 probability of resistant talk after
nonresistant talk. The difference, of course, is
highly reliable.

4 Five seconds was chosen because the patient could
utter one sentence in this period of time.

FIG. 1. EVENTS FOLLOWING SILENCE, FOLLOWING
RESISTANT TALK, AND FOLLOWING NONRESISTANT
TALK

EXPERIENCED THERAPISTS (N=2)

FIG. 2. THERAPIST’S RESPONSE TO RESISTANT AND
TO NONRESISTANT TALK OF PATIENT

These findings are what would be expected 
if silences were equivalent to resistant speech. 

Effect of patient’s activity on therapist’s behavior.
The two apprentice therapists were
not more likely to intervene after resistant
talk than after nonresistant talk. If they did
intervene, they were equally likely to make an
interpretation or to make a noninterpretative
remark. The interventions of these beginners,
it can be seen, do not fit very well the analytic
description of correct technique.

TABLE 2
LIKELIHOOD OF VARIOUS RESPONSES BY PATIENT
AFTER INTERVENTIONS BY THERAPIST

                                   Therapist’s Intervention
Patient’s Response                 Interpretive   Noninterpretive
Resistant talk                         .21             .20
Nonresistant talk (excluding
  Mild Agreement)                      .24             .54
Mild Agreement                         .49             .21
Silence                                .06             .05

The experienced therapists, on the other 
hand, were more likely to interpret resistance. 
Figure 2 presents the over-all results. Thera- 
pist A was more likely to intervene after resist- 
ant talk than after nonresistant talk (the 
probability figures are .24 and .18, respec- 
tively; the difference is statistically signifi- 
cant). Therapist B was about equally likely to 
intervene after resistant or nonresistant talk 
(.50, .47). But if he did intervene after resist- 
ance, he made an interpretation more often; 
38% of his interventions after resistance and 
29% of them after nonresistant talk were 
interpretative. This difference is statistically 
significant. 

It is apparent that the experienced thera- 
pists were better able than the inexperienced 
ones to identify quickly and to deal directly 
with resistance. Since interpretation is be- 
lieved to be the method whereby the therapist 
helps make the patient’s unconscious con- 
scious, by interpreting resistance these thera- 
pists were acting in accord with theory in the 
interventions that they made. 

It should also be noted that the experienced 
therapists talked more than the apprentice 
therapists in the cases studied. However, this 
finding should not be taken as evidence that 
skillful therapists must talk as much as 
Therapists A and B did in the cases studied 
here. In other cases that we have studied, a 
skilled therapist acted very adaptively with- 
out having to talk much. 

Effect of therapist’s intervention on patient’s 
behavior. We made a preliminary study of the 
effect of interpretations by noting what the
next sentence of the patient was after an interpretation
and after noninterpretative interventions.
The results are given in Table 2. It
can be seen that, if one may judge by what 
the patient says immediately afterward, in- 
terpretation did not produce a great increase 
in resistance. The probability of the patient’s 
saying “Umhum” or “Yeah” was, however,
much greater after an interpretation than after 
other interventions. (The difference is highly 
reliable.) No doubt this is so because most 
interpretations call for some answer from the 
patient, some indication that he agrees or 
disagrees. It should be emphasized, however, 
that we do not suppose that the patient’s Yes 
indicates more than polite agreement—“I 
heard you.” Such a reply cannot at all be 
considered evidence that the patient has ac- 
cepted and adopted the interpretation. 
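
The comparison drawn from Table 2 can be laid out explicitly. In the sketch below the entries of Table 2 are read as probabilities of the patient's next sentence, one column for each kind of intervention; the dictionary layout and the variable names are merely a convenience and are not part of the original analysis.

```python
# Probabilities read from Table 2 (patient's next sentence after an
# intervention). The dictionary layout is only a convenience.
table_2 = {
    "Interpretive": {
        "Resistant talk": 0.21,
        "Nonresistant talk": 0.24,
        "Mild Agreement": 0.49,
        "Silence": 0.06,
    },
    "Noninterpretive": {
        "Resistant talk": 0.20,
        "Nonresistant talk": 0.54,
        "Mild Agreement": 0.21,
        "Silence": 0.05,
    },
}

# Each column should account for all of the patient's responses.
column_sums = {k: round(sum(v.values()), 2) for k, v in table_2.items()}

# Difference (interpretive minus noninterpretive) for every response.
differences = {
    response: round(
        table_2["Interpretive"][response] - table_2["Noninterpretive"][response], 2
    )
    for response in table_2["Interpretive"]
}
print(column_sums)
print(differences)
```

Laid out this way, the two columns differ by about .01 for resistant talk and for silence, while the gain in Mild Agreement after interpretations is matched by a loss in other nonresistant talk.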

Further studies on the effect of various 
kinds of intervention by the therapist are 
planned; it is hoped that they will throw some 
light on the controversies concerning what 
moves psychotherapy ahead. 


SUMMARY 


As a demonstration of the value of studying 
sequential dependencies in psychotherapy, 
data are presented bearing on four hypotheses:
that the patient’s speech is coherent, that
silence is equivalent to resistant speech, that
analytic therapists are more likely to intervene
after resistances, and that interpretations
by the therapist produce an upswelling
of resistance. The first three hypotheses were 
borne out, but the last one (derived from 
client-centered theory) was not. The authors 
believe that the results obtained justify wider 
use of this method of studying psychotherapy.


A COMPARISON OF MENTAL RETARDATES AND NORMALS ON
VISUAL FIGURAL AFTEREFFECTS AND REVERSIBLE FIGURES¹


HERMAN H. SPITZ AND LEONARD S. BLACKMAN
Edward R. Johnstone Training and Research Center, Bordentown, New Jersey


The question of whether mental retardates
exhibit more rigid behavior than
normals is far from settled. Lewin (12)
reported that retarded children display more 
persistence in a single activity than do normals 
of approximately equal CA, although he found 
no significant difference between normal and 
retarded children in total satiation time. A 
more recent study (19) has indicated that re- 
tardates, although much more influenced by 
the support or nonsupport of the examiner, 
persist in monotonous tasks for longer periods 
of time than do normals of equal MA. In an 
experiment measuring the performance of 
equal MA groups on discrimination learning 
tasks, Stevenson and Zigler (15) reported that 
retarded children with a mean CA of 10.9 
years, as well as retarded adults, performed as 
well as normal children with a mean CA of 
5.0 years. Such results argue that low MA 
retardates are no more rigid than normals of 
equal MA. However, because the normals of 
that study were chronologically only 5 years 
of age—an age at which inflexible types of 
behavior have frequently been demonstrated 
(13, 16)—Stevenson and Zigler’s results do not 
dispel the notion that, in general, retardates 
may be described as rather rigid individuals. 

The present study proposes to compare the 
performances of retardates and normals of 
equal chronological age on (a) a test pre- 
sumed to measure perceptual rigidity (4) and 
(b) a test presumed to measure neural modifiability
(18). The results of these comparisons
may add to our understanding of the possible 
relationship between the construct of perceptual
rigidity and the construct of neural
modifiability.

Although Lewin used the term “satiation” 
in its more usual sense of behavioral boredom 
or fatigue, Köhler and Wallach (10) use it to
describe a neurophysiological process which,
among other things, is said to be responsible
for visual figural aftereffects. These aftereffects
can be demonstrated quite readily with
various kinds of figures. The subject (S) is
asked to fixate on a figure (Inspection Figure)
for varying lengths of time, at the conclusion
of which a second figure (Test Figure) is
presented. The Test Figure appears to be
displaced away from the area previously occupied
by the Inspection Figure, and it may
appear to change in size, luminosity, and
depth. Displacement effects can also be found
in the kinesthetic (5, 8) and auditory (11) sense
modalities.

1 This study is the first of a series designed to describe
mental retardation in terms of those specific
components of perception, learning, cognition, and personality
which constitute the dynamic response complex
referred to as intelligent or adaptive behavior.

From their findings, Köhler and Wallach
concluded that the presence of a figural cur- 
rent tends to block the further presence of 
figural currents in the same area. It is this 
satiation process that is said to account for 
figural aftereffects, causing the Test Figure to 
move away from the area of impedance or 
satiation. 

The process of neural satiation is also said to 
account for the reversal of figure and ground, 
as illustrated in reversals reported during the 
fixation of such stimuli as the Necker cube or 
the Rubin Vase-Profile figure. However, Har- 
rower’s (4) modification of the Rubin figure, 
in which no steady fixation is required and in 
which consecutive changes are made on the 
structure of the stimuli, presumably makes 
her test more a measure of general perceptual 
rigidity than specific neural modifiability. 
Nevertheless, if different stages of the neural 
satiation process occur in all perception, then 
this process may play some role in the capacity 
to perceive both figures in a reversible figure, 
even though there is no fixation required of 
the S. 

If the assumption is made that neural 
satiation is a physiological process which is not 
only responsible for certain perceptual proc- 
esses, but is made manifest by these very 
processes, then tests of satiation are percep- 
tual pathways to neurological capacity (14). It 
would be important to know whether there is
a basic neurological difference, as measured by
certain perceptual tasks, in a group of men- 
tally retarded Ss as compared with a group 
with average intellectual development. 

The first hypothesis of the present study is 
that there is a difference between the experi- 
mental group of institutionalized high-grade 
mentally retarded boys, both endogenous and 
exogenous, and a control group of normal 
high school students of approximately the 
same chronological age in capacity to per- 
ceive visual figural aftereffects. 

The second hypothesis is that there is a 
difference in the perceptual rigidity of experi- 
mental and control groups as defined by the 
capacity to shift figure and ground on a 
modified Rubin Vase-Profile test. 

The third hypothesis holds that perceptual 
rigidity varies inversely with the capacity to 
satiate. That is, low perceptual rigidity ac- 
companies high capacity to satiate, as would 
be expected on the basis of satiation theory 
as well as from the results of studies reported 
by Klein (6). 


METHOD 


Subjects. The experimental group for the Reversible 
Figures test consisted of 87 high-grade mentally re- 
tarded adolescent boys, ranging in age from 13 to 21 
years, all residents of a short-term state training school.
Of these, 37 completed the Figural Aftereffect test. 
The difference in number of Ss was due primarily to 
certain restrictions, as indicated below, in the control 
run of the aftereffect test. All but one of the Ss of the 
experimental group had been administered individual 
intelligence tests within a year of this study.

The control group for the Reversible Figures test 
consisted of 57 male students from a local high school 
who were closely equated in chronological age with the 
experimental group. Of these, 41 completed the Figural 
Aftereffect tests. All of these Ss had been administered
the California Mental Maturity Test at some time
during their high school stay.² A comparison of the
mean chronological ages and IQs of the experimental
and control groups is given in Table 1. Ss were chosen
at random within a specified age stratum. As can be
seen from Table 1, the two groups were essentially the
same in chronological age, but the differences in intelligence
were significant well beyond the .001 level.

TABLE 1
DISTRIBUTION OF IQ AND CHRONOLOGICAL AGE:
MENTALLY RETARDED (EXPERIMENTAL) AND NORMAL (CONTROL) GROUPS

                             Mean Age
Group                        (Yrs.)     SD      Mean IQ          SD
Figural Aftereffects Test
  Experimental               17.34      1.69    66.11*           [illegible]
  Control                    16.94      1.32    96.80*           [illegible]
Reversible Figures Test
  Experimental               17.08      1.80    61.52            [illegible]
  Control                    16.81      1.28    95.[illegible]   [illegible]

* Differences significant at the .001 level.

Tests. Two tests were administered to all subjects:
(a) a modification of the Rubin Vase-Profile Reversible
Figure test and (b) a Visual Figural Aftereffect test.

A Rubin Vase-Profile Figure, 1½ inches high,
sketched in black India ink on a plain, white three-by- 
five card, was presented to each S, who was then 
asked, “What does this look like?” If the S gave a 
response to either the center (vase) or the side (profile) 
areas, he was further asked, “Does it look like anything
else?” until he either gave a response to the opposite
area or said he could see nothing more. If he voluntarily
shifted to the opposite area (center to side or
side to center) he was given a score of four points, and
this phase of his testing was complete.

If S could not voluntarily shift, a series of modified 
Rubin Vase-Profile Figures was substituted for the 
original more ambiguous figure. These modifications, 
following Harrower (4), were used in an attempt to 
force a perceptual shift from figure to ground. If S had 
initially responded only to the side or only to the center 
area, he was presented with additional figures which 
were progressively more structured in the opposite 
area. Each of these additions presumably gave the 
ground area more substance as a figure, and therefore 
made it more likely to be perceived.

As soon as S gave a figural response to the area that 
had been the ground for his initial percept, the test was 
halted. Four points were scored for a “voluntary” 
shift and three, two, and one point, respectively, for 
shifts on the progressively more structured cards. If 
S could not shift even after having been presented with 
the final, most structured card, his score was zero. A 
zero score was also given to any S who could not per- 
ceive any figure on the initially presented reversible 
figure, a score which turned out to be applicable to 
only one control and one experimental S. Thus, each S 
could score from zero to four, depending on his capacity 
to shift. The lower the score, the greater S’s perceptual 
rigidity was estimated to be.
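
The scoring rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and the coding of the input (0 for a voluntary shift on the original figure, 1 to 3 for a shift on the first, second, or third modified card, None for no shift or no figure seen) are hypothetical and do not come from the study.

```python
from typing import Optional


def rubin_score(shift_card: Optional[int]) -> int:
    """Return a 0-4 score for the modified Rubin Vase-Profile test.

    shift_card (hypothetical coding, not from the source):
      0    -> voluntary shift on the original figure (4 points)
      1-3  -> shift on the 1st, 2nd, or 3rd more structured card (3, 2, 1 points)
      None -> no shift on any card, or no figure perceived at all (0 points)
    """
    if shift_card is None:
        return 0
    if shift_card not in (0, 1, 2, 3):
        raise ValueError("shift_card must be 0, 1, 2, 3, or None")
    # One point is lost for each additional structured card that was needed.
    return 4 - shift_card
```

A lower returned value corresponds to greater estimated perceptual rigidity, as in the text.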

The Inspection (I) Figures and Test (T) Figures 
used in the present experiment for figural aftereffect 
followed those reported by George (3) to be the most 
likely to elicit the aftereffect of change in size. All 
figures were drawn in black India ink on white showcase 
board. The I card consisted of two outline circles of un- 
equal size, the center of the smaller circle being three 
inches above, and the center of the larger circle being 
three inches below, a small fixation dot. The diameter
of the upper circle was 1[fraction illegible] inches, and of the lower
circle 3[fraction illegible] inches. The fixation dot was [fraction illegible] of an inch
in diameter. The white ground visible to the observer 
was 12 inches high and 18 inches wide. Ss were seated 
72 inches from the apparatus, which was at about eye 
level. An adjustable chin rest was used throughout. 

The T card consisted of two equal outline squares. 
Each was four square inches, with the center of the 
upper square three inches above the fixation dot and 
the center of the lower square three inches below the 
fixation dot. The T card could be dropped in front of 
the I card in such a way that the lower square fell in 
the area previously enclosed by the contours of the 
lower (larger) circle, while the upper square bounded 
the contours of the previously exposed upper (smaller) 
circle. According to satiation theory, fixation of the I 
card should result in an aftereffect in which the upper 
square of the T card appears larger than the lower one. 

The Ss were randomly divided into Subgroups I 
and II for the purpose of varying the sequence of 
fixation periods. The experiment began for each S 
with a control test. The T card was exposed, S was 
told to keep his eyes on the fixation dot and asked, 
“With your eyes still on the dot, tell me right away:
Is one box bigger than the other or are the boxes the
same size?” These instructions were given before every
judgment except that the question, “Are the boxes
the same size?” alternated position with the question,
“Is one box bigger than the other?” in a prearranged
manner. If S reported during the control test that the
squares were unequal, he was immediately disqualified
from the Figural Aftereffect experiment. This was to
eliminate possible contamination of the dependent 
variable by poor discrimination. 

If he passed the control test by reporting the two 
squares as equal, the I card was exposed for a pre- 
scribed length of time, with S instructed to fixate on 
the center dot. Each S was assured during this fixation 
period that he could blink his eyes as much as he 
wanted to, but he was reminded to keep his eyes on the 
dot. Ss of Group I fixated for 1 minute, after which the 
T card was immediately dropped in front of the I 
card and a size judgment requested. As soon as a judgment
was made, the I card was again exposed, but this
time for a 1½-minute fixation period, after which
another judgment of the size of the squares on the T 
card was requested. The apparatus was then screened 
from view and a 2-minute rest period ensued. After 2 
minutes, S resumed his previous position, and another 
judgment of the T card was made. This last judgment 
was a test of the persistence of the aftereffect. 

Group II went through exactly the same procedure,
except that their inspection times were 2 and 2½
minutes, respectively, and their rest period was 4 minutes.

After no less than three weeks, all Ss were retested.
At this second testing, Group I went through Group
II’s original procedure, and vice versa. If any S failed 
the control test on this second testing, he was excluded 
from the study. Each S who completed the experiment 
had to go through four fixation periods of varying 
lengths of time and make eight judgments: two control 
judgments, four postfixation judgments, and two persistence
judgments.


Scores were obtained by giving .25 for each post- 
fixation judgment at which an S reported the top square 
as bigger than the bottom one. The highest possible 
satiation score, then, was 1.00. On the two persistence 
trials a score of one was given for each positive satia- 
tion judgment. A total of two would be the highest 
possible persistence score. 
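
As a minimal sketch of this scoring, assuming each judgment is recorded as True when S reported the top square as bigger (the function names and the boolean coding are illustrative assumptions, not part of the original procedure):

```python
from typing import Sequence


def satiation_score(postfixation: Sequence[bool]) -> float:
    """Score .25 for each of the four postfixation judgments in which the
    top square was reported as bigger; the maximum is 1.00."""
    if len(postfixation) != 4:
        raise ValueError("expected four postfixation judgments")
    return 0.25 * sum(bool(j) for j in postfixation)


def persistence_score(persistence: Sequence[bool]) -> int:
    """Score one point for each of the two persistence trials with a
    positive satiation judgment; the maximum is 2."""
    if len(persistence) != 2:
        raise ValueError("expected two persistence judgments")
    return sum(bool(j) for j in persistence)
```

For example, an S who reported the aftereffect after three of the four fixation periods and on one persistence trial would receive a satiation score of .75 and a persistence score of 1.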


RESULTS 


The first hypothesis of the present study 
was that there is a difference between mental 
retardates and normals in the capacity to per- 
ceive visual figural aftereffects. This hypothe- 
sis was supported by the results (see Table 2). 
The mean satiation score for the control group 
was .591 and for the experimental group, .189. 
The variance of the control group, however, 
was significantly greater than the variance of 
the experimental group, producing a significant
F value of 2.502. Since the standard t test
assumes homogeneity of variance, an approximation
t was computed to test the difference
between the means when the variances of the
two groups are significantly different (1, p.
167). The computed t value of 5.289 was significant
at the .001 level.
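
The two statistics can be approximately reproduced from the summary values in Table 2 (Ns of 41 and 37, means of .591 and .189, SDs of .405 and .256). The sketch below is an assumption about the computation: the source cites its approximation only by reference, so the unpooled-variance statistic used here is a plausible reading, not a confirmed one, and the function names are illustrative.

```python
import math


def variance_ratio(sd_a: float, sd_b: float) -> float:
    """F ratio of the larger variance to the smaller one."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)


def unpooled_t(mean_a: float, sd_a: float, n_a: int,
               mean_b: float, sd_b: float, n_b: int) -> float:
    """t statistic that does not pool the two variances (assumed form of
    the 'approximation t' mentioned in the text)."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Summary values as read from Table 2 (normals first, then retardates).
f_value = variance_ratio(0.405, 0.256)
t_value = unpooled_t(0.591, 0.405, 41, 0.189, 0.256, 37)
print(round(f_value, 3), round(t_value, 3))
```

Both results land close to the reported 2.502 and 5.289; small differences in the last digit would be expected from rounding in the published means and SDs.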

To determine whether this difference might 
be related to etiology within the mentally re- 
tarded classification, the experimental group 
was further subdivided into two subgroups: 
an exogenous group of 16 Ss, based on psycho- 
logical and neurological diagnoses, and an 
endogenous group of 21 Ss who showed no 
evidence of brain damage. Differences between 
the mean satiation scores of these two subgroups
yielded a t of 1.329, which did not
reach the .05 level of significance.

TABLE 2
DIFFERENCES BETWEEN THE MEAN SCORES OF MENTAL RETARDATES AND NORMALS ON TWO PERCEPTUAL TESTS

              Figural Aftereffect Test       Reversible Figures Test
Group         N     Mean    SD     t         N     Mean     SD     t
Normals       41    .591    .405   5.289*    57    3.105    .913   [illegible]*
Retardates    37    .189    .256             87    2.563    .915

* Differences significant at the .001 level, two-tailed test.


TABLE 3
NUMBER AND PERCENTAGE OF “SATIATION RESPONSES” AFTER EACH FIXATION PERIOD

                          Fixation Time
Group         1 min       1½ min      2 min       2½ min
Normals       26 (63%)    25 (61%)    26 (63%)    22 (54%)
Retardates     6 (16%)    10 (27%)     6 (16%)     9 (24%)
χ²            17.91**      9.06*      17.91**      6.9*

* Chi squares significant at the .01 level, two-tailed test.
** Chi squares significant at the .001 level, two-tailed test.


TABLE 4
NUMBER AND PERCENTAGE OF “SATIATION RESPONSES” AFTER TWO REST PERIODS

                        Rest Period
Group              2 min.        4 min.
Normals (41)       7 (17%)       6 (15%)
Retardates (37)    7 (19%)       7 (19%)
χ²                 .04 (NS)*     .26 (NS)*

* Chi square not significant at .05 level.


Tables 3 and 4 show the number of Ss of
each group who satiated, and the chi square
values representing the degree of relationship
between intelligence and the ability to satiate
at each postfixation period and after each
two- and four-minute rest period. It is obvious
from these results that the mental retardates 
are far less able to satiate immediately after 
any of the four fixation periods than are the 
normals. Tests of the persistence of the after- 
effects indicate that in those retardates who 
did satiate, the effects of satiation did not 
dissipate as quickly as they did in normals. 
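
The chi square values in Tables 3 and 4 can be checked from the reported counts. The sketch below assumes an ordinary 2 × 2 chi square without a continuity correction, which is an inference from the arithmetic rather than something the text states; the function name is illustrative.

```python
def chi_square_2x2(yes_a: int, n_a: int, yes_b: int, n_b: int) -> float:
    """Uncorrected chi square for a 2 x 2 table built from two groups,
    given the number of 'satiation responses' and the size of each group."""
    a, b = yes_a, n_a - yes_a          # group A: satiated, did not satiate
    c, d = yes_b, n_b - yes_b          # group B: satiated, did not satiate
    total = n_a + n_b
    numerator = total * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Count pairs as read from Tables 3 and 4 (41 normals, 37 retardates).
print(round(chi_square_2x2(26, 41, 6, 37), 2))   # a fixation-period pair
print(round(chi_square_2x2(25, 41, 10, 37), 2))  # another fixation-period pair
print(round(chi_square_2x2(6, 41, 7, 37), 2))    # a rest-period pair
```

Under this assumption the count pairs 26 versus 6 and 25 versus 10 give values close to the tabled 17.91 and 9.06, and the rest-period pair 6 versus 7 gives a value close to .26.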

Another important finding is the general 
tendency for the mentally retarded group to 
require longer fixation times before satiating. 
In other words, the mentally retarded Ss of 
the present study show a limited capacity to 
satiate, but those who do satiate require 
longer fixation periods to do so, and their 
aftereffects tend to persist longer, relative to 
original satiation levels, than do the after- 
effects of normals. After the second period of 
testing, that is, after the second postfixation 
judgment on a particular day, there appears 
to be a satiation build-up in the mental re- 
tardates. This occurs whether or not the 
second period falls on the day of the shorter, 
1- to 1½-minute test periods, or on the day
of the longer, 2- to 2½-minute test periods.

It should be noted, however, that three 
normal Ss, who had satiated on their last 
persistence test, continued to report the upper 
square as larger on their second control test 
given three weeks later, and therefore had to
be dropped from the figural aftereffect phase
of the study. These were the only Ss who gave 
this type of result, but the possibility must 
remain that for them the effects of satiation 
persisted over a three-week period of time. 
That aftereffects may possibly have persisted 
over an extended period for only three of the 
Ss would suggest that this phenomenon, if it 
exists at all, is a rare one (9). 

The second hypothesis, predicting a differ- 
ence in perceptual rigidity between normals 
and retardates, was also confirmed. On the 
modified Rubin test, the mean score for the 
control group was 3.105, and for the experi- 
mental group it was 2.563, a difference signifi- 
cant at the .001 level (see Table 2). If it can 
be assumed that this Reversible Figure test 
is a measure of perceptual rigidity, then cer- 
tainly the mentally retarded Ss are far more 
rigid than the control group. Differences be- 
tween the 16 exogenous and 21 endogenous 
retardates on the Rubin test did not reach 
the .05 level of significance, suggesting that 
these two etiological groups perform similarly 
on the particular perceptual tasks under study. 

Finally, a correlation of .33 between the 
scores on the Visual Figural Aftereffects test 
and the scores on the Reversible Figures test, 
combining the scores of both the experimental 
and control Ss, is significant beyond the .01 
level. This correlation must be considered 
cautiously, since it included a heterogeneous 
population. In fact, using only the 41 normal 
Ss, a correlation of .28 falls between the .1 
and .05 significance levels. Nevertheless, it 
lends some support to the third hypothesis 
that perceptual rigidity varies inversely with 
the capacity to satiate. The explanation for 
the positive correlation supporting an hypoth- 
esis of inverse relations is that a high score on 
the Reversible Figures test is indicative of 
low perceptual rigidity, whereas a high score 
on the Aftereffects test indicates high satia- 
bility. This correlation further suggests that 
there may well be a common factor underlying 
the capacity to shift figure and ground, even 
without fixation, and the capacity to perceive 
visual figural aftereffects. 
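
The significance statements for the two correlations can be illustrated by converting each r to a t statistic with N − 2 degrees of freedom. The sketch assumes N = 78 for the combined correlation (41 normals plus 37 retardates who completed both tests); the text does not state this N, and the critical values in the comments are approximate figures from a standard t table.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero,
    with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)


# Combined groups; N = 78 is an assumption, not stated in the source.
t_combined = t_from_r(0.33, 78)
# Normals only, N = 41 as given in the text.
t_normals = t_from_r(0.28, 41)

# Approximate two-tailed critical values (standard t table):
#   df = 76: about 2.64 at the .01 level
#   df = 39: about 1.68 at the .10 level and about 2.02 at the .05 level
print(round(t_combined, 2), round(t_normals, 2))
```

Under these assumptions the combined value exceeds the .01 criterion, while the value for the normals alone falls between the .10 and .05 criteria, in line with the text.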


DISCUSSION 


Although the low scores of the retarded 
group on the aftereffect test are assumed to be 
due to poor satiability, there are three other
possibilities which must be taken into account.
(a) The retardates were unable to fixate as 
well as normals and therefore did not perceive 
the aftereffects. This possibility was greatly 
decreased, if not entirely eliminated, by 
having an experimenter continually observe 
the S in order to remind him, if necessary, to 
continue to fixate. If fixation was impossible, 
the S was dropped from the study. (b) It is
possible that the retarded Ss perceived the 
lower square as smaller after fixation, but 
reported the two squares as equal. In order to 
assess this possibility, a study is planned in 
which the Ss will be conditioned to the after- 
effect. In this way, the response will not be 
dependent upon a verbal report. (c) It may 
be that the mentally retarded satiated to the 
same degree as did the normals, but were 
unable to make fine size discriminations. In 
this regard, one additional set of data which 
seemed highly significant resulted from the 
study. Excluding those 3 Ss who may have 
persisted in satiating over a three-week period 
of time, 11 out of 52 normals inaccurately 
perceived two equal squares as being unequal 
on one of the two control tests. Of the men- 
tally retarded, on the other hand, 44 out of 81 
failed on one of the two control runs. This 
suggests that the mentally retarded are far 
poorer than normals in the ability to make 
size discriminations. This lead was followed a 
step further. The mean IQ of the 37 retard- 
ates who had successfully completed the two 
control tests was 66.11. The mean IQ of the 
44 retardates who failed at least one of the 
control tests was 57.07, a difference significant 
at the .01 level. The mean chronological ages 
of these two groups were not significantly 
different. These preliminary data would seem 
to indicate that intelligence is intimately re- 
lated to the capacity to make accurate size 
discriminations. 

The results of the reversible figure experi- 
ment, indicating that retardates as a group 
show greater perceptual rigidity, are in accord
with what might be expected from the results 
of the figural aftereffect experiment. The 
measure of perceptual rigidity used in the 
present study may be just one measure of a 
higher order trait, much as Klein’s “leveling”
and “sharpening” groups were isolated on the
basis of their performances on particular perceptual
tasks (6). This trait may, in turn, be
the manifestation of a general cortical
capacity for change which Wertheimer (18) 
has labeled “brain modifiability” and which 
Klein and Krech (7) have called “cortical 
conductivity.” If this is the case, the high- 
grade mental retardates of the present study 
exhibit a noticeable limitation in this area, a 
limitation that is not confined to exogenous 
or endogenous types (2, p. 41). These results 
lend support to the contention of Lewin (12) 
that retardates exhibit more rigid behavior 
than do normals, at least when matched for 
chronological age. 

The results of the present study may be 
compared with those reported by Wertheimer 
and his associates (17, 18), who found that 
schizophrenics show poorer satiability on tests 
of both visual and kinesthetic aftereffects. The 
interpretation which suggests itself is that 
high-grade mental retardates show a resistance 
to cortical change similar to that found in 
schizophrenics. Anyone who has attempted to 
modify schizophrenic thinking and behavior 
by means of psychotherapy, and who has also 
had experience in teaching the mentally re- 
tarded, should have little difficulty in seeing 
some merit in this interpretation. Wertheimer 
attributed his results to a lowered efficiency of 
tissue metabolism in the schizophrenics. 


SUMMARY 


A group of institutionalized high-grade mentally
retarded adolescent boys was compared
with a group of normal boys, equated for
chronological age, on two perceptual tasks.
The results were interpreted in terms of Köhler
and Wallach’s Theory of Satiation. The
group of retardates showed a significantly 
poorer capacity to satiate, as measured by a 
Visual Figural Aftereffect test. Tests of the 
persistence of the aftereffects indicated that 
the effects of satiation do not dissipate as 
rapidly in retardates as they do in normals. 

The mentally retarded Ss manifested sig- 
nificantly greater perceptual rigidity, as de- 
fined by a modified Rubin Vase-Profile 
Reversible Figures test, than the normal Ss. 
No significant differences were found between 
endogenous and exogenous retardates on either 
test. 

A significant correlation was found between 
scores on the Visual Figural Aftereffects test 
and the Rubin Vase-Profile Reversible Figures
test, suggesting a common factor underlying
both perceptual rigidity and limited
capacity to satiate.


GENERALIZATION OF CHILDREN’S PREFERENCES AS A FUNCTION
OF REINFORCEMENT AND TASK SIMILARITY¹


HAVA BONNE GEWIRTZ 


University of Maryland 


This study deals with the effects of certain
social reinforcement conditions on
the stimulus generalization gradients
they produce in children. There have been
several recent attempts to apply the concepts 
of generalization and reinforcement to com- 
plex human behavior. Among these, Miller’s 
(7, 8) theoretical models for conflict and dis- 
placement phenomena are central. Yet the 
experiments upon which they are based most 
often involved infrahuman organisms, physio- 
logical reinforcement conditions, and a single 
dimension of stimulus similarity. And while 
some studies (4, 12) dealing with complex 
personality variables have incorporated Mil- 
ler’s conflict theory, generalization in these 
typically was inferred from assumptions about 
similarity between objects or situations which 
could not in themselves be tested because of 
situational or temporal distance. 

In this study, an attempt is made to apply 
the concepts of generalization and reinforce- 
ment to a relatively complex human situation 
in which the reinforcement conditions are 
manipulated experimentally. An experiment 
was designed to investigate children’s prefer- 
ences for a series of problem-solving tasks, as 
a function of success and failure experiences 
(positive and negative reinforcers) associated 
with a training task, and as related to the 
tasks’ degree of similarity to the training task. 
It was expected that if positive reinforcement 
followed the solution response to the training 
task, the differential preference values as- 
signed to the tasks—ordered along the 
similarity dimension—would represent an ap- 
proach gradient of preference (i.e., preference 
for the tasks would increase with their in- 
creasing similarity to the training task); and 
that negative reinforcement applied to the 
training task would produce an avoidance
gradient (i.e., preference would increase with
increasing dissimilarity to the training task).
In addition, it was intended to compare the
slopes of the two resultant gradients, for a
fundamental assumption of Miller’s conflict
model is that avoidance gradients are steeper
in slope than are approach gradients.

1 This paper is based on a portion of a doctoral
dissertation submitted to the Department of Psychology
of the University of Chicago. The writer wishes
to express her appreciation and gratitude to Helen
L. Koch and to Lyle V. Jones for their guidance in
the course of this study.

Studies of preferences and similar responses 
(3, 5) have suggested that such behaviors 
could be acquired according to the laws of 
conditioning and reinforcement. But it has 
been demonstrated also (2, 10) that complex 
social reinforcement experiences such as suc- 
cess and failure may have different implica- 
tions for subjects with different reinforcement 
histories relevant to the treatment variable, 
and that these differences may be reflected in 
the effectiveness of experimental treatment 
conditions. This consideration has been taken 
into account in the design and interpretation 
of this study. 

METHOD 
Subjects 

One hundred children in the first and second grades 
of a university laboratory school served as subjects 
(Ss). They ranged in age from 6-1 to 8-0 years, and 
their median IQ score was 130.² Children who were
considered behavior problems by their teachers and
those who were unwilling to participate were excluded. 


Materials 


A puzzle-solving situation was employed. The ma- 
terial consisted of five Masonite formboard-type puz- 
zles. While all five puzzle frames with their respective 
covers were of equal size (11 × 8[fraction illegible] inches), the
diamond-shaped depression within each frame (which 
constituted the puzzle proper) represented different 
points along a dimension of shape similarity (Fig. 1). 
The puzzles at the extremes of the dimension (1 and 5) 
were used as training puzzles. Each was equipped with 
two sets of seven triangular plywood pieces to be fitted 
into the puzzle depression: one was the easy set designed
to insure successful solution, the other was the
difficult set designed to produce failure. While both
sets were similar in general appearance, the asymmetry
of the pieces in the difficult set made solution practically
impossible for Ss of the ages employed. The
frames and covers of all five puzzles were painted a
uniform grey, while the depression in each puzzle as
well as all the pieces in the four sets were painted a
bright red. In addition, the outline of each puzzle
depression was reproduced in red on top of the puzzle’s
cover. In this way, Ss could observe the shape similarity
dimension when the puzzles were covered, without
the opportunity to discover clues about the correct
placement of the puzzle pieces inside.

FIG. 1. PUZZLE SIMILARITY DIMENSION
(Puzzles 1 and 5 [the training puzzles] are open and show the “difficult” sets of pieces as placed correctly
inside. Puzzles 2, 3, and 4 are closed and their outlines are shown on top of the covers.)

2 IQ scores, most of which were based on the Revised
Stanford-Binet, were obtained from the school
records. A few scores based on different tests were well
above or below the median value, so that it is unlikely
that the median would have been altered had all Ss
been given the Binet.


Experimental Conditions


Three major conditions were employed: positive
reinforcement, negative reinforcement, and control
(no reinforcement). The reinforcement conditions
represented a combination of three elements. Positive
reinforcement consisted of (a) objective success in the
puzzle-solution attempt, (b) the experimenter’s (E’s)
approval (e.g., “Good,” “That was fine!”), and (c)
S’s winning a material reward or prize (a small plastic
trinket of the kind found in gumball vending machines).
Negative reinforcement consisted of (a) objective
failure in the solution attempt, (b) E’s reproof (e.g.,
“That wasn’t too good,” “Uh-uh”), and (c) the withholding
of the material prize. The three elements under
each condition were combined in order to maximize the
effects of the experimental treatment as well as to
minimize individual differences among Ss in their
susceptibility to the different kinds of reinforcers
involved.

The sample was divided into five groups of 20 Ss
each. With the exception of stratification according to
sex and grade,³ the assignment of Ss to the five groups
was made at random. These groups did not differ significantly
in age or intelligence test scores. Of the five
groups, two received positive reinforcement, two
received negative reinforcement, and one served as
the control. Of each two groups receiving the same reinforcement
condition, one was trained on Puzzle 1
and the other on Puzzle 5. The two training puzzles
were employed under each reinforcement condition
as a crude control for the possibility that puzzles
might be differentially attractive due to specific shape
characteristics rather than to the effects of the experimental
treatment. Thus, in addition to the control
group, there were four treatment groups representing
the four combinations of reinforcement condition and
training puzzle: Positive 1, Negative 1, Positive 5, and
Negative 5.

3 This stratification is ignored in the presentations
which follow, because neither age nor sex was found to
be related to the experimental results.


Procedure 


The experimental session, in which Ss were seen
individually for a period of approximately 20 minutes,
comprised three consecutive phases: the dimension-training
phase, the experimental treatment
phase (omitted in the case of control Ss), and the paired-
comparison testing phase. The dimension-training
phase served to familiarize Ss with the puzzle similarity
dimension. S was shown the five puzzles, all covered
and arranged in a row in a random order. He was
asked to rearrange them in terms of their similarity
to each other and to verbalize the relevant dimension
characteristics. While no S failed to rearrange the puzzles
correctly, some were unable to verbalize the basis
for their arrangement. In such cases, E provided the
relevant information (e.g., “See, they become longer
and longer”). This procedure was adopted to insure
that Ss were equally aware of the shape similarity dimension.

The experimental treatment phase followed when S
was led to a separate table and presented with the appropriate
training puzzle (either 1 or 5). He was urged
to try to solve the puzzle and was told he would win a
little prize for each successful solution. At this point S
was presented with a box containing about 20 prizes
and was permitted to select and put aside his three 
favorites. To convince skeptical Ss that solution was 
possible, E preceded the treatment by a brief demon- 
stration, employing the set of pieces on which S was 
to be trained. This was carried out so rapidly that 
there seemed little danger that S would learn the 
difficult pattern or lose interest in the simple one. 
Actual treatment began only after this demonstration, 
and consisted of three trials of puzzle solution. Under 
the positive reinforcement condition, S was provided 
with the easy set of pieces. After each of the three 
successful trials he was praised by E and allowed to 
keep one of the three prizes. Under negative reinforce- 
ment S was provided with the difficult set of pieces. 
A trial under this condition was defined as an unsuccessful
solution attempt, when S gave up either spontaneously
or following E’s suggestion to end it and
“start it all over again.” Each of the three failures was
accompanied by reproof and the removal of one of the
previously selected prizes.⁴

The paired-comparison testing phase followed the 
last training trial, when S was asked whether he would 
like to play some more, but this time with a puzzle of 
his own choice. The suggestion was welcomed by every 
S. E then presented to S, successively, the 10 different 
possible pairs of the five puzzles. During the presenta- 
tion of each pair, S was asked to point to the puzzle 
with which he would prefer to play. The sequence of 
pair-presentation was random except that each puzzle 
appeared an equal number of times in the right and 
left position in the pair. When the presentation of all 
10 pairs was completed, S was asked to give the reason 
for his preference choice (“Why did you pick those?”).
Control Ss, who were not subjected to the treatment 
procedure, were presented with the paired-comparison 
test immediately following the dimension-training 
phase. 
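
A presentation schedule of the kind described, with all 10 pairs of the five puzzles, each puzzle equally often on the left and on the right, and a random sequence, can be sketched as follows. The circular rule used to balance positions is one convenient construction and is an assumption; the source does not say how its sequence was built, and the function name is illustrative.

```python
import itertools
import random
from typing import List, Optional, Tuple


def balanced_pairs(n_items: int = 5,
                   seed: Optional[int] = None) -> List[Tuple[int, int]]:
    """Return every pair of items as (left, right), in random order, with
    each item appearing equally often in the left and right positions."""
    if n_items % 2 == 0:
        raise ValueError("position balance needs an odd number of items")
    half = (n_items - 1) // 2
    pairs = []
    for i, j in itertools.combinations(range(n_items), 2):
        # Circular rule: i goes on the left when j is within `half` steps
        # ahead of i; otherwise j goes on the left.
        if (j - i) % n_items <= half:
            pairs.append((i + 1, j + 1))
        else:
            pairs.append((j + 1, i + 1))
    random.Random(seed).shuffle(pairs)
    return pairs


schedule = balanced_pairs(5, seed=0)
print(schedule)
```

With five puzzles each one enters four pairs, so it appears twice on the left and twice on the right.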


RESULTS 


Approach and Avoidance Generalization Gradients


The data were analyzed first by means of
the rank analysis method (1, 11), which enabled
testing two null hypotheses: (a) puzzle
preference equality (i.e., that in each of the
five groups there were no significant differences
among the preference rankings assigned
to the five puzzles); and (b) agreement between
groups (i.e., that there were no differences
between each experimental group and the
control group in terms of their respective
preference patterns).

4 Five of the positively reinforced Ss were noticed
to have considerable difficulty in solving the easy puzzle;
and three of the negatively reinforced Ss almost
succeeded in solving the difficult one. These few cases,
however, did not change the results and hence are not
discussed further.

5 Although the experimental session was concluded
at this point, Ss in the control and negative groups
were now permitted to play with the puzzle of their
choice, to solve it successfully, and to win the three
prizes. This was done in order to temper somewhat
the experiences of all Ss before their return to their
classrooms, and thus to prevent harmful rumors
about failure or loss of prizes from reaching children
who had not yet had their turn as Ss.


TABLE 1
CHI SQUARE VALUES FOR THE RANK ANALYSIS TESTS:
PUZZLE PREFERENCE EQUALITY WITHIN EACH GROUP AND AGREEMENT IN PREFERENCE
PATTERN BETWEEN EACH TREATMENT GROUP AND THE CONTROL GROUP

              Puzzle Preference    Agreement
Group         Equality             with Control
Control       4.24
Positive 1    [illegible]          [illegible]
Negative 1    [illegible]          [illegible]
Positive 5    [illegible]          [illegible]
Negative 5    [illegible]          [illegible]

Figures 2 and 3 show the composite preference
gradients (based on preference proportions
derived from the rankings) obtained
from the groups trained on Puzzle 1 and
Puzzle 5, respectively. The control group
curve is the same in both figures. Table 1
presents the results of the tests of equality and
agreement. As shown, the expectations advanced
were fully confirmed only in the case
of the two negative groups: each exhibited a
reliable avoidance gradient which differed significantly
from the preference pattern of the
control group. In contrast, the results obtained
from the two positive groups were less conclusive:
only Positive 5 exhibited significant
departure from puzzle preference equality, yet
its preference trend was not a simple function
of the puzzle similarity dimension; and
neither positive group differed significantly
from the control group. In addition, response
pattern variability within the positive groups
was considerably greater than it was within 
the negative groups, suggesting that a con- 
siderable number of Ss were unaffected by the 
positive treatment condition. This variability 
was unrelated either to age or sex of Ss, and 
appeared to be a function of randomly dis- 
tributed individual differences. 

To examine the nature of these differences, 
it was necessary to determine the number and 
type of preference response patterns given by 
individual Ss. Since this information could not 
be provided by the rank analysis method, 
another analysis was undertaken. It was based 
TABLE 2
NUMBER OF Ss EXHIBITING INDIVIDUAL GRADIENT
PATTERNS IN EACH GROUP, AND DIFFERENCE
BETWEEN TREATMENT AND CONTROL GROUP
FREQUENCIES

            Gradient Pattern              Nongradient   Difference
Group       Approach  Avoidance  Total    Pattern       from Control*

Control                          5        15
Positive    18        8          26       14
Negative    0         30         30       10

* χ² corrected for discontinuity, two-tailed test (1 df).


*» < 1 


on frequency distributions of Ss classified into 
two major categories: those exhibiting “gra- 
dient” patterns, and those exhibiting ‘“non- 
gradient” patterns. S was said to exhibit a 
gradient pattern when the sum of his prefer- 
ence ranks represented the values 4, 5, 6, 7, 8 
(or 8, 7, 6, 5, 4), respectively, for the five 
puzzles arranged in terms of the similarity 
dimension.6 While this particular pattern represented
the “perfect” individual gradient
possible within the paired-comparison scoring 
method, slight deviations from this perfect 
order were included also in the gradient pat- 
tern category (e.g., a single reversal in adja- 
cent puzzles; one or two ties on adjacent 
puzzles provided the order of preference was 
still maintained in relation to the similarity 
dimension). All Ss whose sums of ranks did 
not satisfy these criteria were classified in the 
nongradient category. 
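The scoring rule behind the gradient criterion can be sketched in a few lines of Python. This is only an illustration of the rank sums described here (rank 1 to the preferred member of each pair, rank 2 to the rejected one); the names `PUZZLES`, `rank_sums`, and `is_perfect_gradient`, and the sample S, are hypothetical, and the looser "slight deviation" rules are not modeled.

```python
from itertools import combinations

PUZZLES = [1, 2, 3, 4, 5]  # ordered along the similarity dimension


def rank_sums(choices):
    """Sum paired-comparison ranks per puzzle.

    `choices` maps each pair (a, b) with a < b to the preferred puzzle.
    The preferred member gets rank 1 and the rejected member rank 2.
    """
    totals = {p: 0 for p in PUZZLES}
    for a, b in combinations(PUZZLES, 2):
        preferred = choices[(a, b)]
        rejected = b if preferred == a else a
        totals[preferred] += 1
        totals[rejected] += 2
    return [totals[p] for p in PUZZLES]


def is_perfect_gradient(sums):
    """True for the sums 4, 5, 6, 7, 8 in either direction."""
    return sums == [4, 5, 6, 7, 8] or sums == [8, 7, 6, 5, 4]


# Hypothetical S who always prefers the puzzle nearer to Puzzle 1.
always_lower = {pair: min(pair) for pair in combinations(PUZZLES, 2)}
sums = rank_sums(always_lower)
```

Each puzzle appears in four of the ten pairs, so every sum lies between 4 and 8 and the five sums always total 30.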

Table 2 presents the frequency distribution 
of Ss falling within the gradient and nongra- 
dient categories, including also classification 
by gradient direction (i.e., approach or avoid- 
ance relative to the training puzzle). Because 
the different training puzzles did not produce 
different response gradients in Ss, in this table 
and in all subsequent ones the data are pre- 
sented for the combined positive and the 
combined negative groups, with 40 Ss in each. 
As shown, the proportion of positively and of 
negatively reinforced Ss who exhibited gradient

6 These values are based on the paired-comparison
scores, where the preferred member in the pair received
the rank of 1 and the rejected member received
the rank of 2. Each of the five puzzles appeared a total
of four times in the ten-pair presentations. Hence, for a
single S, the puzzle always preferred would receive the
minimal total rank of 4, and the puzzle never preferred
would receive the maximal total rank of 8.

patterns was significantly greater, in each
case, than that proportion among the control 
Ss. While it is seen that the treatments pro- 
duced generalization gradients, this analysis 
reveals also that the positive reinforcement 
condition produced two kinds of gradients: 
approach and avoidance. This finding could 
explain the over-all similarity between the 
control and the positive groups shown in the 
rank analysis, since the individual gradients, 
opposite in direction, appear to have cancelled 
each other in the composite group scores. 


Factors Underlying the Effects of Positive Re- 
inforcement 

The question that still remained, however, 
was: why did some of the positively reinforced 
Ss exhibit avoidance gradients, which had been 
expected only under negative reinforcement? 
Since the experiment consisted of a goal-at- 
tainment situation, it was assumed that the 
variations found in Ss’ response patterns re- 
flected individual differences in the intensity 
of their involvement in success, i.e., in their 
achievement motivation. Two classes of in- 
formation were available in this study for the 
purpose of testing this assumption: (a) the 
reasons given by Ss for their preference 
choices, and (b) their IQ level. These two
variables were each taken to be a plausible
indicator of Ss’ strength of achievement motivation
(for reasons to be noted subsequently).

Of the various types of reasons given by Ss, 
two indicated involvement in achievement: 
preference for easy tasks (e.g., “Because it
looks easier”, “The others are too hard for
me”); and preference for difficult tasks (e.g.,
“I think it is harder, I like to try it”, “Harder
is more fun”). These two types of reasons,
Ease and Difficulty, were given by more than 
50% of all experimentally reinforced Ss, but 
only by 10% of control Ss. For IQ level, Ss 
were divided into two groups: the “High” IQ 
group consisted of Ss whose scores were above 
the over-all sample median of 130, and the 
“Low” IQ group consisted of those with 
scores below and including that median. It 
was postulated that the avoidance gradients 
generated by the positive reinforcement con- 
dition were an outcome of strong achievement 
motivation, and as such would be exhibited 
more frequently by the more highly achieve- 
TABLE 3
NUMBER OF Ss IN THE POSITIVE GROUP CLASSIFIED
IN TERMS OF GRADIENT DIRECTION, IQ LEVEL,
AND TYPE OF REASON


Gradient Pattern and Direction 


Non- 
gradient 


Avoidance 
Gradient 


Approach 
Gradient 


Reason _ zs & @ 


Ease 3 0 1 
Difficulty 4 0 1 
Other . 0 6 6 


Total 7 6 8 


Note. “L” and “H” refer to Low and High IQ groups, respectively.


ment-motivated Ss, who were expected to be 
in the High IQ group and to give Difficulty as 
their reason. Table 3 presents the frequency 
distribution of the 40 positively reinforced Ss 
in terms of these three variables. When the 
relationships were tested by means of exact 
one-tailed tests for fourfold tables (6), the 
following results were obtained, all at p < .01: 
(a) Ss who exhibited avoidance gradients 
tended to give Difficulty as their reason (5/8), 
while Ss who exhibited approach gradients 
tended to give Ease as their reason (11/18); 
(b) Ss who exhibited avoidance gradients were
predominantly in the High IQ group (7/8), 
while Ss who exhibited approach gradients 
were mostly in the Low IQ group (15/18); 
(c) when IQ level and type of reason were 
employed as a joint criterion, it was found 
that all 10 Ss who were simultaneously in the 
Low IQ and Ease categories exhibited ap- 
proach gradients, while all four Ss who were 
simultaneously in the High IQ and Difficulty 
categories exhibited avoidance gradients. 
Thus, the postulated relationships between 
gradient direction, IQ level, and type of 
reason appear to be supported; and positive 
reinforcement was shown to generate quite 
different gradient patterns depending on Ss’ 
strength of achievement motivation. 
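For readers who want to check one of these comparisons, the following Python sketch computes a one-tailed exact probability for a fourfold table from the hypergeometric distribution. It assumes the standard Fisher construction; the text cites reference (6) for the test but does not spell out the computation, and the function name `fisher_one_tailed` is hypothetical. The counts are the IQ-by-direction figures quoted above.

```python
from math import comb


def fisher_one_tailed(a, b, c, d):
    """One-tailed exact p for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with all margins fixed, of a count in the
    top-left cell at least as large as the observed `a`.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, row1)
    upper = min(row1, col1)
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    )
    return tail / total


# Counts quoted in the text: 7 of 8 avoidance-gradient Ss were High IQ,
# and 15 of 18 approach-gradient Ss were Low IQ (so 3 were High IQ).
p_iq = fisher_one_tailed(7, 1, 3, 15)
```

Under these assumptions the tail probability falls below .01, in line with the significance level reported for relationship (b).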


Relative Steepness of Slope 

The examination of the relative steepness of 
slope of the generalization gradients produced 
by positive and negative reinforcement in- 
volved classification of individual gradient 
patterns in terms of their degree of steepness. 
The steepest gradient possible under the 
TABLE 4
NUMBER OF Ss IN THE TWO REINFORCEMENT GROUPS
EXHIBITING PERFECT AND IMPERFECT APPROACH
AND AVOIDANCE GRADIENTS

            Approach Gradient    Avoidance Gradient   Total Gradient
Group       Perfect  Imperfect   Perfect  Imperfect   Perfect  Imperfect

Negative    0        0           22       8           22       8
Positive    8        10          2        6           10       16


method employed was the “perfect” gradient, 
described earlier. All other gradient patterns 
were classified as “imperfect.” Table 4 pre- 
sents the distribution of Ss in the two rein- 
forcement groups in terms of this steepness 
criterion, and also according to gradient direction.
Two-tailed tests (χ² corrected for
discontinuity, 1 df) were used. The first question
addressed itself to the effects of each
treatment condition on steepness, ignoring
gradient direction. It was found that the proportion
of steep gradients that were obtained
after negative reinforcement (22/30) was
significantly greater (p < .02) than the proportion
obtained after positive reinforcement
(10/26). When gradient direction was taken
into account, however, it was found that
perfect avoidance gradients were produced
more frequently (p < .01) in the negative
group (22/30) than in the positive group
(2/8); but that no differences existed in the
positive group between the proportion of its
perfect avoidance gradients (2/8) and the
proportion of its perfect approach gradients
(8/18). These results indicate that, regardless
of gradient direction, negative reinforcement
produced more and steeper individual gradients
than did the positive reinforcement condition.
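The first comparison (22/30 against 10/26) can be reproduced with a short Python sketch of the chi square corrected for discontinuity on a fourfold table. The function names are hypothetical, and the formula is the familiar Yates form, assumed here to be what "corrected for discontinuity" denotes; the p value uses the fact that a chi-square variate with 1 df is a squared standard normal.

```python
from math import erfc, sqrt


def yates_chi_square(a, b, c, d):
    """Chi square with continuity correction for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shrink |ad - bc| by n/2, never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denom


def p_two_tailed_1df(chi2):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return erfc(sqrt(chi2 / 2))


# Perfect vs. imperfect gradients, ignoring direction:
# negative group 22 vs. 8, positive group 10 vs. 16.
chi2 = yates_chi_square(22, 8, 10, 16)
p = p_two_tailed_1df(chi2)
```

With these counts the probability lands between .01 and .02, which agrees with the reported p < .02.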


DISCUSSION 


Achievement Motivation and Preference Gra- 
dients 


It has been suggested (e.g., 2) that children’s 
preferences for difficult goals are a function of 
strong achievement emphasis during sociali- 
zation. Similarly, when, following success, S 
verbalized preference for a difficult task in the 
present study, it was taken to indicate that he 
was more motivated to demonstrate out- 
standing achievement than was another S who 
verbalized preference for the easy task on
which success had already been experienced.
The use of IQ level as the second index of
achievement motivation was based on the
notion that ability is a significant determinant
of the outcome of past encounters with difficult
tasks. Highly intelligent children are likely to
be rewarded for attempts to excel on such
tasks, either because of actual success, due to
superior ability, or because parental approval
is often contingent upon their accomplishing
more than “just doing what everybody else
can do.” Hence, the highly able child should
be more likely to acquire strong achievement
motivation than the less able child. Another
possible approach could be based on the notion
that high IQ scores are, in part, an outcome of
strong achievement motivation. The child who
persists in his problem-solving efforts and re-
fuses to give up easily is more likely to attain
a higher score than one who is indifferent or
gives up at first sight of difficulty.

These assumed relationships, regardless of
their causal direction, appear to be supported
by the present experimental results. Mastery
of the difficult task appeared to have a
stronger reinforcing value to the highly
achievement-motivated S than repeated suc-
cess on the easy one. Hence, it would seem
more correct to conclude that rather than
generating approach and avoidance gradients,
the positive reinforcement condition produced
two kinds of approach gradients: one by the
less achievement-motivated Ss, directed to-
wards the goal of easy and safe success (the
approach gradient); and the other by the
highly achievement-motivated Ss directed to-
wards the goal of outstanding achievement in
terms of solving the most difficult appearing
task (the avoidance-like gradient). This con-
clusion seems supported also by the finding
that these two kinds of gradients resembled
each other in terms of the steepness criterion,
regardless of direction.


Negative Reinforcement Generates the Steeper 
Gradient 


The limitations inherent in the paired-comparison
method7 restrict somewhat statements


7 Because of the lack of independent response measures
for each of the compared items, the effects of
reinforcement on the response to each of the five
puzzles are confounded with its effects on the discrimination
of the training puzzle from all others.












about the specific shape of the generalization 
gradients obtained. Nevertheless, the marked 
differences between the effects of the two 
treatment conditions suggest certain specula- 
tive interpretations. While the distinction be- 
tween primary and acquired drives, as 
suggested by Miller and Murray (9), does not 
fit the case of the present study, an analogous 
differentiation between “internal” and “ex- 
ternal” sources of motivation for the response 
might be applied here. The fact that no in- 
dividual differences could be detected in the 
uniform response pattern exhibited by the 
negative group suggests that such differences 
—though theoretically present—were over- 
shadowed by the powerful and unambiguous 
negative treatment condition. The avoidance 
response appears to have been contingent 
predominantly upon the external stimulus 
situation which, when changed, was readily 
discriminated and produced a decrease in re- 
sponse strength resulting in a steep generali- 
zation gradient. On the other hand, the 
variability in the positive group’s response 
pattern indicated that this treatment condition
was responded to by Ss primarily
according to their individual aspirations and
achievement needs. The “need to succeed”
(being different perhaps for each S) was
relatively more constant and more independent
of the external stimulus situation.
Hence, changes in that situation did not bring 
about a marked, or at least uniform, reduction 
in response strength, making for flatter gen- 
eralization gradients than those produced 
under the negative reinforcement condition. 


SUMMARY 


An experiment was designed to study 
children’s preferences for a series of problem- 
solving tasks as a function of the particular 
reinforcement condition associated with a 
training task, and the degree of similarity of 
each task to the training task. One hundred 
first- and second-graders served as Ss. Five 
formboard-type puzzles were constructed such 
that their shapes constituted a similarity di- 
mension. The two puzzles at the extremes of 
this dimension served as training puzzles, each 
for one-half of a group receiving the same re- 
inforcement. Forty Ss received positive rein- 
forcement, consisting of successful puzzle 
solution, praise, and a material reward; 40 Ss
received negative reinforcement, consisting of
failure, reproof, and withdrawal of the reward;
and 20 Ss were not reinforced, serving as a
control group. Ss’ differential preferences for
the five puzzles were obtained through sub- 
sequent paired-comparison presentations. 
The results were: 1. Following reinforce- 
ment, the differential preference rankings for 
the puzzles (arranged in terms of the similarity 
dimension) represented generalization gradi- 
ents of preference. But while negative rein- 
forcement uniformly generated avoidance 
gradients, positive reinforcement produced in 
some Ss approach gradients and in others 
avoidance gradients. 2. Those positively rein- 
forced Ss who exhibited avoidance gradients 
tended to have superior IQ, and to verbalize 
preference for difficult tasks. It was postulated 
that these Ss were more achievement-moti- 
vated than those who exhibited approach 
gradients, and that their response pattern 
actually represented approach towards a more 
challenging task. 3. When the slopes of the
gradients were compared, it was found that 
negative reinforcement generated a larger pro- 
portion of steeper individual gradients than 
did positive reinforcement. It was suggested 
that this finding could be related to the dis- 
tinction between external and internal factors 
which determined the response. 
THE past decade has witnessed a phenomenal
increase in theories and investigations
of the group pressures and
interactions which result from discrepancies
in opinion. Such differences, Festinger and his 
co-workers observed (13, 14, 15, 16, 32), re- 
sult in three distinct manifestations of pressure 
toward uniformity, each of which operates to 
reduce this discrepancy: a pressure on the in- 
dividual to alter his own opinion so as to bring 
it in line with his fellows; a pressure on the 
individual or group to attempt to influence 
discrepant individuals so as to bring their 
opinions in line; a pressure on the group
or individuals to redefine the boundaries of 
their group, increasing uniformity by reject- 
ing discrepant individuals. The results of these 
investigations have contributed considerably 
to our understanding of social influence. They 
have also indicated a number of problems 
which need additional investigation. 

First, we still have much to learn about the 
factors that increase pressures toward uni- 
formity in the group and of their effects on 
the individual in relation to his group. Group 
pressures have been shown to increase with 
cohesiveness (1, 16, 32), homogeneity (17, 19), 
and clarity of group goals and procedures (31), 
and to decrease when opinions are anchored 
in other groups (20, 25). Though rejection by 
the group has been shown to be a manifestation 
of pressures toward uniformity (32), there has 
been little investigation of the effects of fear 
of rejection on pressures toward uniformity.2

1 This study was conducted under contract with the
Office of Naval Research while the author was a 
member of the staff of the Research Center for Group 
Dynamics, University of Michigan. The more com- 
plete dissertation upon which this report is based (29) 
was submitted to the faculty of the University of Michi- 
gan in partial fulfillment of the requirements for the 
Ph.D. degree. The author wishes to express his appreciation
to John R. P. French for his patient direction, to
Leon Festinger for the initial inspiration, and to Irwin Goffman
and Stanley Thorley for their invaluable assistance
in the analysis of the data. 

2 Since this investigation, there have been several 
reports of the effects of fear of rejection (9, 22, 24). 
Secondly, there is still the question of the
relationship between the social influences on 
opinion and influences on the content, per- 
ceived or communicated, which bears on that 
opinion. Duncker (10) may have been one of 
the first to observe that “prestige suggestion” 
might be the result of new information and 
new perceptions about the object of opinion, 
rather than blind conformity. Or as Asch (1) 
states it, prestige suggestion might be “a 
change in the object of judgment, rather than 
in the judgment of the object.” Asch, Block, 
and Hertzman (3) demonstrated experimen- 
tally that subjects who altered their evalua- 
tion of the profession of “politics” to conform 
to a majority evaluation also altered their 
perception of the connotation of the word 
“politics.” However, whether alteration of 
perception of the object of judgment preceded 
the alteration of the judgment, as the experi- 
menters suggested, or whether the changes in 
perception rather occurred after the change in 
judgment is not clear. The distinction between 
social influence on opinion and on content 
bearing on the opinion was, nevertheless, 
clearly established. Further investigation to 
determine the relationship between these two
types of pressure seems in order. 

Thirdly, the distinction has been made be- 
tween that change in the individual’s opinion 
which is evident to the group or “public” 
and that which is “private” (33). More re- 
cently there have been several attempts to 
delimit the conditions that result in private 
acceptance of a change in opinion as compared 
with those that result in public conformity 
without private acceptance (8, 14, 18, 25, 26, 
30). We would further expect that even though 
the person’s opinion may remain private, in- 
sofar as his group is concerned, there will be 
pressures to change toward the group, and 
these will be accelerated if he must communi- 
cate content that is related to the opinion. 

It is our purpose here to investigate a 
situation where the individual finds his own 
opinion discrepant from the clearly defined 
norms of his group, but where he may keep
his opinion private. We would vary the group 
pressures upon the individual by allowing 
some groups the opportunity to reject the 
individual for nonconformity. In addition, we 
wished to vary the extent to which his com- 
munications regarding the opinion would re- 
main private. Where he must communicate to 
his group material relating to the discrepant 
opinion, pressure to change should be in- 
creased. Furthermore, the group pressures 
would operate to distort the content relating 
to the opinion even though the opinion itself 
would remain private. These pressures would 
be greater with respect to content that is to be 
communicated to the group than upon content 
that the individual perceives. Pressures to- 
ward uniformity operating on opinion, on 
content communicated, and on content per- 
ceived should be increased when there is 
possibility of rejection for nonconformity. It 
was to examine such hypotheses that this 
experiment was designed. 


THE EXPERIMENT 
Procedure


There were from 10 to 14 undergraduates in each 
experimental group, a total of 344 subjects (Ss) in all. 
Each group was homogeneous with respect to sex and 
class status. The cohesiveness of each group was 
heightened by a procedure utilized by Back (4), by 
using only those Ss who had expressed an interest in the 
subject matter proposed for discussion (“human relations
and social problems”), and by a short group
discussion.

Each S was then given a juvenile case study to read. 
“The Case of Johnny Sandron” was adapted from an 
actual case study (12). Johnny had been arrested for the 
robbery and murder of an elderly lady. Interviews with 
Johnny, with his mother, and with his teacher were 
presented, each providing contradictory information 
about Johnny and his crime. When the initial reading 
had been completed, Ss were asked to indicate their 
opinion on a seven-point scale wherein they evaluated 


Opinion Scale    Actual       Presented
Position         Consensus    Consensus
1
2                             XXXXXXX
3                A            XX
4                BC           X
5                DEFGH        X
6                IJKL         X
7                M            X


FIG. 1. A TYPICAL OPINION CONSENSUS FOR A GROUP
OF THIRTEEN AND THE FICTITIOUS CONSENSUS
PRESENTED TO THAT GROUP
the extent to which they felt that Johnny was personally
responsible for his crime: 

1. Johnny had many decent influences of which he 
could have taken advantage. The conditions under 
which he lived were friendly enough so that they can 
hardly be held responsible for his misdeeds. The 
blame for his crime must be placed entirely upon his 
shoulders. ...4. Johnny had both helpful and dis- 
turbing influences. The blame for his crime must be 
placed equally both on Johnny and on conditions in 
which he lived. . . . 7. Considering the terrible condi- 
tions under which Johnny lived, it would seem 
almost a miracle that he didn’t come out worse than 
he did. In spite of everything, Johnny’s behavior 
was still essentially good. Not he but his environment 
is entirely to blame for his misdeeds. 

It was emphasized to the Ss that their opinion was
stated privately, to the E only, and would not be 
communicated to fellow group members. 

The next operation was designed to make as many 
Ss as possible feel that they were deviates from a well- 
defined group norm: 

You are probably curious about how others in this 
group think about the extent to which Johnny is 
responsible for his crime. Therefore I shall place on 
the blackboard the number of people who have 
chosen each position. I shall indicate this with Xs 
so that it will not be necessary to identify each 
person specifically. 

Most Ss tended to choose opinions along the environmental
responsibility end of the scale, and a false
consensus was presented which indicated that most Ss
occupied an opinion at the “personal responsibility”
end of the scale (see Fig. 1). Thus, Ss who occupied lone 
positions at 5, 6, or 7 saw most of their group members 
at 2 or 3. Ss at 5, 6, or 7 will henceforth be called the 
deviates; those who initially held Positions 2, 3, or 4, 
will be called modes. The reaction to the fictitious consensus
was often one of shocked surprise. However,
only a few Ss were suspicious of the manipulation, and 
these were readily detected and eliminated from the 
analysis of the data. 

Eventually, Ss were told that they would have to 
write a group report of the case, much as social workers 
must do. This would be a report upon which all must 
agree completely. A prize would be awarded to the 
group which submitted the best report. With this in 
mind, Ss were now asked to reread the case study, and a 
second statement of opinion was solicited. Again, the 
opinion statement was private, not to be seen by fellow 
Ss, and it was possible to compare the new opinion 
statement with the initial one to get a measure of the 
effects of group pressures on opinion. 

As further aid in writing their group report, Ss were 
next asked to write individually a description of the 
case as they saw it. A rough outline was provided by the 
E. The description was solicited by the E so that it
might later be coded for group influences on content 
related to the opinion in question. 

Following the individual descriptions of the case, Ss 
were asked to state their private opinion for the third 
time. This was followed by a questionnaire. Then, 
instead of the group discussion and group report, a 
complete explanation of the research project was presented
by the E, and the session was concluded.
The Public-Private Variation


Though it was stressed to all Ss that their 
own opinions need never be stated publicly to 
the group, this did not hold for the descrip- 
tions of the case study which they wrote in- 
dividually. In the Public condition, Ss were 
told that their individual descriptions would 
be passed around so that all other group mem- 
bers could see them. To facilitate such 
distribution, carbon copies were made of the 
individual descriptions. In the Private condi- 
tion, the individual descriptions were not to be 
seen by anyone other than the E. The purpose
of the Private descriptions was presumably to 
aid Ss in formulating their own ideas. Thus in 
the Public condition, Ss were forced to com- 
municate opinion-relevant content to mem- 
bers of their group, whereas such communica- 
tion was not forced on Ss in the Private 
condition. Furthermore, the content of the 
Public descriptions would indicate group in- 
fluences on communication. The content of 
Private descriptions could be considered as 
indicating primarily group influences on per- 
ception or cognition. 


The Variation of Possibility of Rejection 


In order to observe the effects of differing 
degrees of pressure toward uniformity, we also 
varied possibility of rejection. Ss in the Re- 
jection conditions were told that, since it was 
especially important that members of the 
group get along well together, we were offering 
them an opportunity to reorganize the group 
so as to exclude those members with whom 
they could not get along well. After they had 
had an opportunity to discuss part of the case 
study, ballots would presumably be passed 
out on which they could indicate their candi- 
dates for rejection. The three individuals with 
the most rejection votes would then leave to 
take part in another research project. The 
Nonrejection Ss were told that all of them 
would remain for the entire group session. 

Both the Rejection and Public-Private
variations were introduced at the very begin- 
ning of the experiment and were reinforced at 
various times throughout the session. Thus we 
had four experimental conditions: Public Re- 
jection, Public Nonrejection, Private Rejection, 
Private Nonrejection. Within each of these con- 
ditions, Ss were divided into modes and de- 


TABLE 1
DISTRIBUTION OF SUBJECTS IN EXPERIMENTAL
CONDITIONS

                Public               Private
Condition       Modes   Deviates     Modes   Deviates

Rejection       24      74           18      76
Nonrejection    11      67           17      57


viates according to whether they held initial 
opinions which were in agreement with the 
initial group norm. The number in each con- 
dition is shown in Table 1. 


Data Collection 

Three types of data were collected: opinion 
statements, questionnaire responses, and con- 
tent analysis of case descriptions. 

1. Data on change of opinion. Three private 
statements, regarding the extent to which 
Johnny was responsible for his crime, were 
obtained from the Ss. One was obtained after 
the first cursory reading of the case study. 
The second statement was made after the 
fictitious consensus had been presented and 
after a rereading of the case study. The third 
statement was made after the individual re- 
port had been written. Change of opinion, and 
direction of change (toward or away from the 
group norm), could be measured from a com- 
parison of these opinion statements. 

2. Questionnaire data. The questionnaire
administered at the end of the experiment was 
constructed primarily to test the effectiveness 
of the experimental manipulations. 

3. Content analysis of individual descrip- 
tions. To gain information regarding the Ss’ 
tendency to select or distort content in re- 
sponse to group pressures, the content of the 
individual descriptions was analyzed. Each 
description was broken down into a number 
of meaning units or items. Each item was now 
coded in terms of whether it was “favorable” 
(supporting the end of the opinion scale 
which was favorable to Johnny), “unfavorable” 
(supporting a position unfavorable to 
Johnny), or “neutral” (not clearly sup- 
porting either end of the scale). Thus “fa- 
vorable” items tended to support the position 
actually chosen by most Ss; “unfavor- 
able” items tended to support the presumed 
group norm. After adding the number of 
items in each category for the individual 
description, a “coefficient of imbalance” (CI)
was calculated, subtracting the percentage of 
unfavorable responses from the percentage of 
favorable responses (23). An arcsine trans- 
formation was used to make the distribution 
of scores approach normality.3 Thus, the
greater the CI, the more positive the content 
and, among deviates, the greater number of 
items supporting the S’s own initial opinion, 
favorable to the delinquent. The less positive, 
or more negative, the CI, the more the deviate 
shows evidence of selecting or interpreting the 
content so as to support the supposed group 
norm. 
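As a rough illustration of how such a coefficient behaves, the Python sketch below takes the arcsine (in degrees) of the favorable and unfavorable proportions and subtracts the second from the first. The function name and the sample counts are hypothetical. The usual arcsine transformation of a proportion also takes a square root first; that step is offered only as an optional flag and is an assumption, not something the text states.

```python
from math import asin, degrees, sqrt


def coefficient_of_imbalance(f, u, n, use_sqrt=False):
    """Arcsine-transformed favorable share minus unfavorable share, in degrees.

    f, u, n: counts of favorable, unfavorable, and neutral items.
    use_sqrt: apply the square root that the usual arcsine transformation
    of proportions includes (an assumption; off by default).
    """
    total = f + u + n
    if total == 0:
        raise ValueError("description has no coded items")
    pf, pu = f / total, u / total
    if use_sqrt:
        pf, pu = sqrt(pf), sqrt(pu)
    return degrees(asin(pf)) - degrees(asin(pu))


# Hypothetical descriptions.
all_favorable = coefficient_of_imbalance(10, 0, 0)
all_unfavorable = coefficient_of_imbalance(0, 10, 0)
balanced = coefficient_of_imbalance(4, 4, 2)
```

Either variant spans the same extremes: a wholly favorable description scores +90, a wholly unfavorable one scores -90, and equal favorable and unfavorable counts give zero regardless of the number of neutral items.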


The Effectiveness of the Experimental Manipu- 
lations 

The effectiveness of the Public-Private 
manipulation was tested with responses on 
the final questionnaire. One question asked, 
“How much did you feel that your opinion (as 
to whether Johnny or his environment was 
responsible) would affect the degree to which 
the others in the group would prefer to have 
you continue working with them on the case 
study?” with a six-alternative scale. The 
deviate Ss in Public conditions were much 
more likely to see a relationship between 
opinion and rejection than those in the Pri- 
vate conditions (p = .04, by chi square), 
indicating that Ss did indeed feel that their 
opinions would be more obvious if their de- 
scriptions were seen by others. However, 
another question, “What possibility is there 
that the other members of the group will find 
out how you personally feel about this case 
study?” showed no significant differences, 
perhaps because of the ambiguity of the phrase 

* The precise formula for the coefficient of imbalance 
was: 

\[ CI = \arcsin\frac{f}{f+u+n} - \arcsin\frac{u}{f+u+n}, \]

where f = number of favorable items, u = number of 
unfavorable items, n = number of neutral items. The 
possible range of scores was from +90.00 to −90.00. 
For all Ss, the mean CI was + 1.03, the SD was 19.3. 
A reliability check between two independent judges, 
based on 33 cases, and a total of 777 items, showed 
agreement on 79.8%, with only 2.9% in which one 
judge coded an item as positive and the other as 
negative. The correlation between the 33 pairs of CIs 
was .84. The validity of the CI was demonstrated by 
a strong relationship between CI and initial opinion, 
significant at the .001 level by chi square. 
“how you personally feel” (p = .20, by chi 
square). 

To test the realism of the rejection manipu- 
lation, we asked “How likely is it that the 
others in the group would prefer that you not 
continue working with them?” There were 
sharp differences between Rejection and Non- 
rejection Ss in their choice of the six alterna- 
tives with respect to this question, Rejection 
Ss perceiving a much greater possibility of 
rejection (p = .001, by chi square). Deviate 
Ss in the Rejection conditions were also more 
likely to feel that their opinion would be the 
basis for rejection than were those in Nonrejection 
conditions, their responses to the 
question above also being statistically signi- 
ficant (p = .001, chi square). The interaction 
effect between rejection and publicity was also 
significant (p = .05, 2 × 2 × 2 chi square). 


Deviates in the Public Rejection condition 
were most likely to feel that their opinions 
would influence their acceptability; deviates 
in the Private Nonrejection condition were 
least likely to see this relationship. 

The false consensus was also effective in 
giving deviate Ss the impression that they 


held opinions which were sharply discrepant 
from the group, even though their opinions in 
actuality were in line with the majority. The 
responses to the question “How much did 
your first opinion differ from that of the group 
as a whole?” showed sharp differences between 
deviates and modes (p = .001, chi square). 
Deviates were also more likely to associate 
their opinions with rejection than were modes 
(p = .002, chi square). 

In general, our manipulations seemed effective, 
though we could have hoped for 
clearer evidence to this effect with respect to 
the Public-Private variation. 


HYPOTHESES AND RESULTS 


We had thus organized a number of co- 
hesive groups, each with a clearly defined 
group norm from which a number of the Ss 
felt that they deviated. There was pressure on 
the group to achieve uniformity. Ss wrote 
individual descriptions of a case study which 
in some groups were to be made public, in 
others were to remain private. In all groups, 
however, the individual’s opinions were kept 
private. In some groups, the Ss were convinced 
that deviates might be rejected for nonconformity; 
in others no such possibility of rejection 
existed. The experiment was designed 
to test two sets of hypotheses, some with re- 
spect to pressures on opinion, others dealing 
with pressures on content: 


Group Pressures on Opinion 


Given a situation such that (a) an individ- 
ual is a member of a group toward which 
he is attracted, (b) he deviates from a 
well-defined norm of that group, (c) the 
individual has no intense involvement in 
his initial opinion, (d) there is pressure on 
the group to achieve uniformity with re- 
spect to that opinion, (e) the individual 
does not perceive that he can readily 
achieve uniformity by influencing others: 


Hypothesis 1: There will be pressures on the 
individual to change his opinion toward the 
group norm. 

This hypothesis is consistent with findings 
by Asch (2), Sherif (34), Festinger (13, 15), 
and others (4, 8, 16, 19, 20, 26). We note, 
however, that this pressure to change is pre- 
dicted even though the opinion is to remain 
private. Specifically, we predicted that the 
deviates would tend to change their opinions 
more than the modes, such changes being in 
the direction of the supposed group norm. 
Table 2 shows that this hypothesis was clearly 
supported, as evidenced in comparison of 
initial and final statements of opinion. 
Whereas modes tended to shift seldom, and 
then in either direction, a very large per- 
centage of the deviates changed toward the 
group norm. 


Hypothesis 2: The more the individual must 
communicate regarding the object of opinion, 
the greater will be the pressure to change his 
opinion toward the group norm. 

This would occur even if the opinion itself 
were to remain private. The communication 
will of itself make the individual more aware 
of his discrepancy from the group norm, and 
emphasize the importance of conformity. In 
addition, we suspect that the pressures which 
affect communicated content operate upon 
perceived content, eventually effecting change 
in opinion. We shall discuss this further be- 
low. As we can see in Table 3, the Ss in the 
TABLE 2 
NUMBERS OF MODES AND DEVIATES WHO CHANGED 
THEIR OPINIONSᵃ, ᵇ 


            Initial        Changed toward   Did not   Changed toward 
            Position       Position 2       Change    Position 7 

Mode        2, 3, or 4          12             46          12 
Deviate     5, 6, or 7          89            168          16 


ᵃ Supposed group norm was at Positions 2 and 3. Comparison 
is between initial and final statement of opinion. 

ᵇ Deviates, at Positions 5, 6, and 7, were much more likely to 
change toward the opposite extreme than were the modes, who 
supposedly had group support: 89 of the 343 deviates made such a 
change, compared to 12 of the 70 modes. This difference in proportions 
is significant at well below the .001 level of confidence. 
Comparable figures on the basis of the second statement of opinion 
were 85 for deviates and 8 for modes, of the same total. 


TABLE 3 
PROPORTION OF DEVIATES WHO CHANGED TOWARD 
THE GROUP NORMᵃ 


                Public    Private    Combined 

Rejection        .41        .30        .35 
Nonrejection     .36        .17        .29 
Combined         .39        .26 


ᵃ These figures are based upon comparison between first and 
final opinion statement. Comparable proportions for change at the 
second statement are: Public Rejection, .36; Public Nonrejection, 
4; Private Rejection, .29; Private Nonrejection, .23. These pro- 
portions are not significantly different from those shown above. 


Public conditions were more likely to change 
than those in the Private conditions, the com- 
bined proportions being significantly different 
at the .02 point (t test of difference in proportions). 
This difference was especially marked 
among the Nonrejection Ss (p = .03). The 
lesser difference among Rejection Ss may be 
due to the fact that, with 41% of the Ss 
changing in the Public Rejection condition, 
we had reached some upper limit with respect 
to willingness to accept group influence. The 
hypothesis can be said to be substantially 
supported. 


Hypothesis 3: The greater the possibility of 
rejection for nonconformity, the greater the 
pressure to change toward the group norm. 


If fear of disapproval can serve as a source 
of pressure toward uniformity, possibility of 
overt rejection should accentuate such pres- 
sure. Again, the prediction was made for de- 
viates even though they need not publicly 
state their opinions. Dittes and Kelley (9) 
have since found additional evidence to sup- 
port such an hypothesis. 
Though the differences were in the predicted 
direction, rejection alone seemed to 
have no significant effect upon change in 
opinion, the combined proportions, comparing 
Rejection and Nonrejection deviates, being 
different at the .13 level of significance (t test). 
However, pressure to change opinion was 
least for deviates who did not fear rejection 
and also had assurance that their descriptions 
would not be made public. The difference be- 
tween the Private Nonrejection condition and 
the other conditions was significant at the .02 
level. Only 17% of the deviates in the Private 
Nonrejection condition changed their opinions 
toward the group norm, compared to 41% for 
the Public Rejection deviates (p = .01, by t 
test). Obviously, the effects of rejection must 
be considered in conjunction with publicity. 


Pressure to Change and Opinion Scale Position 


Though no prior hypotheses were formu- 
lated with respect to the relationship between 
change in opinion and the extremity of opin- 
ion, such a relationship did emerge in the 
analysis, and it will assume some importance 
in the discussion that follows. Studies of at- 
titudes have presented evidence that seems to 
relate intensity of an attitude with the ex- 
tremity of that attitude. Cantril (6) plotted 
extremity of attitude toward government 
regulation of business against intensity and 
found a U-shaped curve. Guttman and Such- 
man (21) have adopted intensity as a means 
of establishing the zero point of an attitude. 
A similar relationship was found by Kelley 
and Volkart (25). We might expect that, in 
this experiment, deviates who held Position 5 
in the first opinion census would feel less 
strongly about their opinion than those at 
Position 6 or 7, and should therefore be more 
susceptible to pressures to change. In fact, 
32% of the Ss at the extreme positions 
changed, while 39% of those at Position 5 


TABLE 4 


NUMBER OF DEVIATES CHANGING ONE, TWO, THREE, 
AND FOUR STEPS TOWARD THE NORM 


Initial         Number of positions changed 
Position        1    2 


16 
3 


0 


changed their opinions toward the norm. 
Though these proportions do not differ sig- 
nificantly, one must also consider the fact 
that Ss at Position 6 or 7 had a greater num- 
ber of positions toward which they could 
change. The effects of extremity on change of 
opinion are even more evident in the amount 
of change (Table 4). Of those deviates who did 
change, we find that those at extreme initial 
positions were likely to change only one step, 
while those at Position 5 were just as likely to 
change two and three steps. The difference 
here is significant (p = .002, by chi square). 


Group Pressures on Content 


Hypothesis 4: Group pressures to select and 
distort content regarding the object of opinion 
will be greater with respect to content that is to be 
communicated to the group than on content that 
is perceived by the individual. 


Assuming that fear of disapproval or rejec- 
tion by the group is prominent, the individual 
in his communication would be expected to 
select those items from the content he per- 
ceived that support the group norm, to avoid 
items of content that oppose the group, and 
to distort ambiguous items so that they seem 
to support the group norm. Thus the content 
communicated would indicate additional selec- 
tion and distortion over that perceived. This 
is, again, consistent with studies by Schanck 
(33) and others who have noted differences 
between publicly expressed opinions and those 
which are privately held, but we go even fur- 
ther and suggest that even though the opinion 
itself may remain private, communication re- 
garding the object of opinion will be distorted. 


Hypothesis 5: Group pressure to select and 
distort communicated and perceived content so 
as to support the group norm will be greater 
when there is possibility of rejection for non- 
conformity. 


Under the conditions outlined, more overt 
pressure from the group might influence both 
communicated and perceived content, even 
though the opinion itself is private. The group 
effects on communication, outlined above, 
should be increased when possibility of rejection 
is more salient. Furthermore, we expected this 
increased effect on the individual’s perceptions 
as well. Let us assume that the individual examines 
the object of opinion with a view toward 
communicating acceptable items to his 
group. Eventually, this need for acceptance 
and approval would result in selective and 
distorted perceptions, just as other needs have 
been shown to influence perception (5, 28). 
Accordingly, increased group pressure would 
influence both perceived and communicated 
content. 

From Hypothesis 4 we should predict that 
deviates in the Public conditions should show 
more selection and distortion of content toward 
the group norm (less positive CIs) than those 
in the Private conditions. From Hypothesis 5, 
deviates in the Rejection conditions should 
have less positive CIs than those in the Non- 
rejection conditions. CIs should be most 
positive in the Private Nonrejection condition 
and least positive in the Public Rejection con- 
dition. Even for those Ss who did not change 
their opinions, group pressures should affect 
communication and perception. 

The first test of this hypothesis comes then 
from an examination of the CIs for those 
deviates who did not change their opinions 
toward the group norm (Table 5). It is evi- 
dent that the hypotheses were not supported 
for these Ss. The Public Nonrejection condi- 
tion showed a less positive CI than the Private 
Nonrejection condition (p = .02 by t test) but 
even this difference is questionable since the 
Public Nonrejection CIs were also more 
variable (p = .01, by F test). 

There is some question whether the deviates 
who did not change their opinions provide an 
adequate test of the hypotheses. As the result 
of differing pressures on opinions, the opinion 
distributions of nonchanging deviates at the 
time of the writing of the descriptions differed 
for the various experimental conditions. We 
might therefore get a more adequate test from 
those Ss who held the same position at the 
time that they wrote descriptions. Thus, we 
see in Table 6 the CIs of all deviates who held 
Position 5 according to their final opinion 
statement. By our previous analysis, these Ss 
should be relatively less involved in their own 
opinion than those who held Position 6 and 7. 
The differences here are all in the predicted 
direction. The difference between the Public 
and Private deviates is significant at the .05 
level (t test), supporting Hypothesis 4. This 
is particularly true for Ss who did not fear 
rejection (p = .03, by t test), but less true for 


TABLE 5 


MEAN COEFFICIENTS OF IMBALANCE FOR DEVIATES 
WHO DID NOT CHANGE THEIR OPINIONSᵃ, ᵇ 


                Public        Private 

Rejection       8.14 (38)     8.69 (42) 
Nonrejection     .69 (36)     9.21 (36) 


ᵃ Difference between Public Nonrejection and Private Nonrejection 
significant at .02 by t test, but Public Nonrejection variance 
was also greater (p = .01, F test). Other differences were not 
significant. 

ᵇ The figures in parentheses represent the number of Ss in each 
category. A more positive CI represents a description containing a 
greater amount of content supporting the “deviate” end of the 
opinion scale. 


TABLE 6 


MEAN COEFFICIENTS OF IMBALANCE FOR DEVIATES 
WHO HELD POSITION 5 AT FINAL OPINION 
STATEMENTᵃ, ᵇ 


                Public        Private       Combined 

Rejection        .85 (28)     4.73 (31)     2.89 (59) 
Nonrejection    1.42 (28)    12.30 (17)     5.53 (45) 
Combined        1.13 (56)     7.42 (48) 


ᵃ The mean coefficient of imbalance of the Private Nonrejection 
condition is significantly greater than that of the Public Rejection 
condition and the Public Nonrejection condition (in both cases, 
p = .03 by t test). The difference between the combined Public 
and combined Private conditions is significant at the .05 level. No 
other differences are significant at the .05 level. 

ᵇ The figures in parentheses represent the number of Ss in each 
category. A more positive CI represents a description containing 
a greater amount of content supporting the “deviate” end of the 
opinion scale. 


those in the Rejection condition. Again, we 
might surmise that some upper limit had been 
reached by Ss in the Public Rejection con- 
dition, reducing the difference between that 
condition and others. Thus Hypothesis 4 
is supported for those Ss who do not have 
strong involvement in their opinion, and 
choose a less extreme position. The differ- 
ences for Ss who held Positions 6 and 7 at 
the final census were not significant. 
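
As a check on how the marginal entries of Table 6 relate to its cell entries, the sketch below recomputes each Combined mean as an N-weighted average of the two cell means it pools. The helper name is hypothetical and the cell values are those read from Table 6; because the printed cell means are already rounded, agreement with the printed margins can be expected only to within about .01.

```python
def pooled_mean(cells):
    """N-weighted mean of a list of (cell_mean, cell_n) pairs."""
    total_n = sum(n for _, n in cells)
    return sum(mean * n for mean, n in cells) / total_n


# Cell means and Ns as read from Table 6 (deviates at Position 5).
public_rejection = (0.85, 28)
private_rejection = (4.73, 31)
public_nonrejection = (1.42, 28)
private_nonrejection = (12.30, 17)

combined = {
    "Rejection": pooled_mean([public_rejection, private_rejection]),
    "Nonrejection": pooled_mean([public_nonrejection, private_nonrejection]),
    "Public": pooled_mean([public_rejection, public_nonrejection]),
    "Private": pooled_mean([private_rejection, private_nonrejection]),
}

for label, value in combined.items():
    print(f"{label:<13}{value:6.2f}")
```

The Public-Private contrast tested for Hypothesis 4 is the difference between the last two of these pooled values.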

The differences between the Rejection and 
Nonrejection deviates at Position 5 were not 
significant (p = .24). The combined effect of 
publicity and possibility of rejection is clear, 
however, the difference between the Public 
Rejection and Private Nonrejection CIs being 
significant at the .03 level by t test (Table 6). 
Thus the data on pressure to distort content 
parallels that on pressure to change opinion. 
Rejection, in and of itself, does not seem to 
significantly affect distortion of content. How- 
ever, deviates who are assured of safety from 
rejection, and of privacy in their descriptions 
of the case study, experience least pressure to 
distort content toward the group norm, and 
are significantly more likely to write content 
supporting their own opinions. 


Content and Opinion Change 

Implicit in much of the foregoing discussion 
is the assumption that group pressures might 
influence opinion by first affecting the com- 
municated and perceived content that bears 
on it. After the perceived content has been 
influenced, the individual will alter his opin- 
ion so as to make it consonant with the object 
as he now sees it. There are some data to sup- 
port such an assumption. Seventeen Ss in all 
changed their opinions toward the group norm 
between their second and third statements, 
that is, during or shortly after the writing of 
their individual descriptions. We compared the 
mean CIs of these people with those of Ss at 
the same initial positions who had not changed 
their opinions. As indicated in Table 7, 15 of 
these 17 Ss had CIs that were more negative 
(more supportive of the group norm) than 
those of Ss at equivalent positions who had 
not changed. A proportion that large would 
arise by chance with a probability of only 
.002 (binomial expansion). 
It seems evident that there was in fact a ten- 
dency for these Ss to bring their opinions in 
line with the content that they perceived or 
communicated. 
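
The binomial figure can be reproduced with a short calculation. The sketch assumes that, under chance, each changing deviate is equally likely to fall above or below the comparison mean (p = .5); the function name is illustrative, and the source does not state whether one or two tails were used.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X distributed Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 15 of the 17 changing deviates had the more negative CI.
one_tailed = binomial_upper_tail(15, 17)
two_tailed = 2 * one_tailed

print(f"one-tailed: {one_tailed:.4f}")
print(f"two-tailed: {two_tailed:.4f}")
```

The upper tail alone is 154/131,072, or about .0012; doubling it gives about .0023, which rounds to the reported .002.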

TABLE 7 

NUMBER OF DEVIATES CHANGING THEIR OPINIONS 
WHILE WRITING DESCRIPTION, SHOWING 
TENDENCY TO MAKE OPINION 
CONSONANT WITH CONTENT 


Position        Frequency of CIs of Changing Deviates, Relative 
Before          to Mean CIs of Nonchanging Deviatesᵃ, ᵇ 
Changing 
                More negative     More positive     Total 

5                    11                 2             13 
6                     4                 0              4 
Total                15                 2             17 


ᵃ As in previous tables, a more negative CI contains a greater 
amount of content supporting the supposed group norm; a description 
with a more positive CI contains more content supporting an 
opinion at the “deviate” end of the opinion scale. 

ᵇ Comparisons are based on changes between the second and 
third opinion statement. Mean CIs for Ss at Position 5 who did not 
change while writing their descriptions was 4.27; 11 of the 13 Ss who 
changed while writing had CIs which were less than 4.27. Nonchanging 
Ss at Position 6 had an average CI of 14.52; all 4 deviates 
who changed opinions between the second and third statements 
had less positive CIs. The probability of 15 of 17 deviates who 
changed having more negative CIs is .002. No Ss at Position 7 
changed between the second and third statement. 
DISCUSSION 


One of the original purposes of this study 
was to distinguish experimentally between 
pressures to change opinions and pressures to 
distort content related to the object of opin- 
ion. It is clear that group pressures can 
operate at both levels. The hypotheses re- 
garding distortion received less strong support 
than those about group pressures on opinion, 
perhaps in part because the measurement of 
opinion change is more direct than measure- 
ment of content. Also the forces affecting 
content related to an opinion would seem 
more diffuse than those that affect opinions. 
There are indications, however, that the 
effects of group influence on opinions are more 
subtle than generally supposed. 

The evidence from this study strongly sug- 
gests that opinions can be influenced through 
first influencing content. This is the position 
that was taken by Asch, Duncker, and others 
(1, 2, 10). However, the theory posed here is 
that such influence may not occur merely 
through the acquisition of additional informa- 
tion. If an individual finds himself in a situa- 
tion such that he is attracted toward a group, 
fears rejection from that group—either overt 
or through ridicule—and finds that his opinion 
is sharply different from that of the group, it 
seems that he might first inhibit his expression 
of specific content that is not supportive of 
the group norm and might prove disturbing 
to the group, or distort his communication of 
ambiguous material so that it supports the 
group norm. He next may begin to look for 
content that he can communicate—the in- 
fluence thus being on his perceptions of the 
content related to group opinion. Studies of 
the “influence of needs on perception” (5) 
suggest that such influence need not be con- 
scious. Finally, the individual actually is 
aware of more content supporting the group 
norm than supporting his original opinion, 
and, in this sense, he has an “objective” basis 
for changing; the total data available to him 
support the group opinion rather than his 
original position. In effect, he changes his 
opinion so as to make it consonant with his 


phenomenological data. Such a theory may 
help to explain the situation that occurs when 
white Northern students in a Southern univer- 
sity gradually adopt the racial attitudes of 
their classmates. By the very fact that he is 
attending a Southern university, the student 
indicates some attraction toward his fellow 
students and a desire for acceptance by them. 
He is initially aware of the discrepancy in 
attitudes towards Negroes, and may inhibit 
the expression of information available to him 
that would place Negroes in a favorable light. 
He next begins to look for things that he can 
communicate, and will remember items that 
tend to support the Southern white point of 
view, distorting or overlooking items that he 
cannot comfortably communicate. We might 
expect that an informational questionnaire 
about Negroes would show that the Northern 
student has accentuated those characteristics 
that support the Southern white stereotype 
well before he actually adopts the Southern 
attitude. Finally, we might find that the 
Northern student has accepted the attitude 
itself, bringing his attitude into consonance 
with his perceptions. This is, of course, only 
one path that pressures toward uniformity 
might take. Nor can we claim support for this 
theory merely on the basis of evidence in this 
study. Additional investigation would seem 
well in order. 


SUMMARY 


In this study, we attempted to determine 
the effects of group pressures on opinion, where 
opinion was to remain private. We were also 
interested in group influences on content re- 
lated to the opinion, both as perceived and as 
communicated. We wished to determine how 
these pressures would be increased where there 
was possibility of rejection for nonconformity. 

Ss were asked to read a delinquency case 
study and to indicate privately an opinion 
position on a scale of personal vs. environmen- 
tal responsibility. A false consensus gave most 
Ss the impression that they were deviates from 
a clearly defined group norm. The opinions 
were to remain private, though later the group 
would discuss the case and reach agreement 
on a report of the case. A later statement of 
opinion confirmed the hypothesis that Ss 
would alter their opinion to conform to the 
group norm even though that opinion was to 
remain private. 

Ss were then asked to write individual de- 
scriptions of the case study. In some groups, 
these descriptions were to be passed around to 
all other Ss. In other groups the descriptions 


were to remain private. Also, some groups 
were told that they might reject some mem- 
bers, while other groups were not offered this 
possibility. 

It was hypothesized that deviates would be 
more likely to change their opinions toward 
the norm when their descriptions were to be 
seen by others; deviates whose descriptions 
were to remain private would be less likely to 
change opinions. The hypothesis was con- 
firmed, even though the opinions themselves 
were to remain private under all conditions. 
It was also predicted that Ss would be more 
likely to change their opinions when there was 
possibility of rejection. This hypothesis was 
confirmed with qualifications. Possibility of 
rejection, in and of itself, did not increase 
change. However, Ss who did not fear rejec- 
tion and whose descriptions were not to be 
made public conformed significantly less than 
the others. 

It was further predicted that group pres- 
sures would operate on the content of the 
descriptions: the tendency to distort content 
so as to support the group norm would be 
greater for public descriptions than for private 
descriptions, and greater for those Ss who 
faced the possibility of rejection than for 
those who would not be rejected. These hy- 
potheses were unconfirmed for the deviates as 
a whole, but differences were clearer for 
the Ss who held less extreme positions and 
were therefore less involved in their own prior 
opinions. For these less extreme Ss, the public 
descriptions were indeed significantly more 
supportive of the group norm than the private 
descriptions. The effects of possibility of re- 
jection on distortion of content paralleled 
those upon change in opinion. The greatest 
distortion occurs when there is both possibility 
of rejection and publicity of the statement. 
Least distortion occurs 
when there is neither possibility of rejection 
nor publicity. 

There was also evidence that changes in 
opinion occurred during and shortly after the 
writing of the group descriptions. Such 
changes tended to bring opinions in line with 
the content of the descriptions—the group 
influence here operated first on the descrip- 
tions and then on the opinion itself. On the 
basis of this and other evidence, a theory of 
social influences on opinions was suggested: 
1. Pressure to change may first operate on 
content communicated, the individual selecting 
and distorting the content he communicates 
so as to avoid rejection from the group. 
2. The deviate then begins to perceive selec- 
tively and distort items of content, so that he 
perceives an ever increasing body of content 
supporting the group norm, and fewer items 
supporting his own initial position. 3. Finally, 
the deviate sees a greater amount of content 
supporting the group norm than his own, and 
then changes his opinion toward conformity, 
thus bringing it in line with the phenomeno- 
logical evidence. 
CRITIQUE AND NOTES 


THE TEST ANXIETY QUESTIONNAIRE: SCORING 
NORMS FOR A NON-COLLEGE POPULATION¹ 


ZANWIL SPERBER 


University of California, Los Angeles 


SARASON, Gordon, and Mandler describe a Test 
Anxiety Questionnaire (TAQ) designed to 
measure anxiety specific to testing situations (2, 
3, 4). The format (4, p. 810) and scoring method 
(3) of the TAQ require computation of the median 
response of a group of Ss to each of 35 questions 
before a given S’s responses can be scored. Most of 
the time and effort in scoring the TAQ is expended 
in computing the medians. A short-cut scoring 
method would encourage more extensive use of a 
measure which has demonstrated its value as a re- 
search tool. 

Sarason and Gordon present data showing the 
stability of medians obtained from two separate 
samples of Yale students (3) and suggest that these 
published medians might serve as norms for scoring 
directly from an S’s protocol. However, they also 
caution that until it is demonstrated that other 
groups have medians similar to those of Yale stu- 
dents, the TAQ should be scored the long way with 
“local” norms. 

In the present paper, TAQ data obtained from 
a noncollege sample of U. S. Air Force recruits are 
compared with those from Yale students. One can 
then begin to evaluate the general usefulness of 
Sarason and Gordon’s suggested short-cut scoring 
procedure. 


YALE AND AIR FORCE SUBJECTS COMPARED 


The TAQ was administered to a sample of 294 
male recruits in December, 1953, as part of a larger 
study of the effects of anxiety and stress on per- 
formance (5). In contrast to the Yale college stu- 
dents, a majority of the recruits did not complete 
high school. They had, on the average, 10.8 years 
of schooling (σ = 1.5). As an estimate of intelligence, 
each recruit’s stanine score on the Technician’s 
Specialty Index (1) was available. Although 
the Ss had been selected to have scores of 
at least 3 on this measure, which yields scores 
through 9, the mean was only 5.4 (σ = 1.7). The 
mean age of the recruits was 18.8 years (σ = 3.3), 
probably not very different from the average age 
of Yale sophomores and juniors. Approximately 
15% of the recruits were Negroes. 

1 The use of USAF recruits as Ss was secured through 
the cooperation of Abraham Carp of the Air Force 
Personnel and Training Research Center. The study, 
however, was not sponsored by the USAF, and the Air 
Force is not responsible for this report. 


It is also possible to compare the social-class 
background of the present sample and the Yale 
groups. The recruits filled out a face sheet which 
included questions about their father’s education 
and occupation. A total of 252 recruits could sup- 
ply information about their fathers’ education. In 
contrast to the fathers of Yale Ss, 70% of whom 
had completed college, only 19 of the recruits’ 
fathers had schooling beyond high school, and only 
four were college graduates. On the average, the 
fathers of the 252 recruits had 8.20 years of education 
(σ = 3.79). 

The occupations of Yale fathers were categorized 
in three strata (4). Some 23% were in upper-class 
occupations (executives, brokers, directors of cor- 
porations, etc.); 70% were in upper-middle or 
middle-class occupations (managerial, professional 
men, owners of small firms, civil service employees, 
etc.); and only 7% were in the lowest class of occu- 
pations (manual laborers, secretarial, and minor 
civil service occupations). In contrast to the Yale 
fathers, the occupations of 251 recruits’ fathers 
were distributed as follows: 110 in skilled manual 
labor, 56 in unskilled manual labor, 5 firemen, 22 
farmers, 9 in secretarial occupations, 7 salesmen, 20 
in managerial occupations (including foremen in 
manual labor situations), 7 in their own small 
business, 6 in professional occupations or owners of 
property or larger businesses, and 9 in miscel- 
laneous occupations (artist, ballet teacher, poli- 
tician, a “regular” in military service, etc.). At 
least 85% of these occupations fall in the lowest 
class noted by Sarason and Mandler, with, at 
most, 2% in the upper-class stratum and the re- 
maining 13% in the middle-class category. 

The recruit sample, for which TAQ norms will 
be reported, obviously differed significantly from 
the Yale samples with respect to social class and 
educational background and very probably in level 
of intelligence. 


TAQ Data 


Table 1 presents the median response of the re- 
cruits and the two Yale samples to each of the 35 
scored questions of the TAQ.² In general, the
medians were similar. For 11 questions, the medians


² The recruit data were originally scored in units of
1 cm (5, p. 218), but medians were converted to the 1.5
cm unit utilized by the Yale researchers to make the
results comparable.
TABLE 1

COMPARISON OF THREE SAMPLES: POINTS (IN RAW SCORE UNITS) BETWEEN WHICH THE MEDIAN
RESPONSE OCCURS

                       Samples                                         Samples
Questionᵇ   Recruits,   Yale,       Yale,        Questionᵇ   Recruits,   Yale,       Yale,
            1953        1951ᵃ       1952ᵃ                    1953        1951ᵃ       1952ᵃ
            (N = 294)   (N = 392)   (N = 359)                (N = 294)   (N = 392)   (N = 359)

    4         4-5         3-4         3-4           23         4-5         4-5         4-5
    5         5-6         3-4         3-4           24         4-5         3-4         2-3
    6         4-5         3-4         3-4           25         4-5         4-5         4-5
    7         4-5         4-5         4-5           26         3-4         3-4         3-4
    8         3-4         2-3         2-3           27         2-3         1-2         1-2
    9         3-4         4-5         4-5           28         5-6         2-3         3-4
   10         4-5         2-3         2-3           29         5-6         6-7         6-7
   11         4-5         3-4         3-4           30         4-5         4-5         4-5
   12         1-2         2-3         2-3           31         4-5         5-6         4-5
   13         0-1         0-1         0-1           32         5-6         4-5         4-5
   14         2-3         2-3         2-3           33         3-4         4-5         4-5
   15         4-5         3-4         3-4           34         3-4         4-5         4-5
   17         5-6         4-5         4-5           35         4-5         4-5         4-5
   18         4-5         4-5         4-5           36         4-5         6-7         6-7
   19         3-4         4-5         4-5           37         3-4         3-4         3-4
   20         5-6         4-5         4-5           38         1-2         2-3         1-2
   21         5-6         4-5         4-5           39         4-5         3-4         2-3
   22         4-5         4-5         4-5


ᵃ Reproduced from Sarason and Gordon, Table 1 (3, p. 448).
ᵇ The scored questions are numbered as they appear on the
TAQ.
TABLE 2
COMPARISON OF THREE SAMPLES: PERCENTAGE OF
SUBJECTS RECEIVING VARIOUS TAQ SCORES

Column headings: Scores included; Recruit Sample, 1953 (N = 294);
Yale Sample, 1951ᵃ (N = 392); Yale Sample, 1952ᵃ (N = 359).

[Table body largely illegible. Surviving entries: score ranges 24-35ᵇ and 0-11;
percentages 24.8 and 25.9.]


ᵃ Reproduced from Sarason and Gordon’s Table 2 (3, p. 448).
ᵇ The highest score possible is 35; the highest score any recruit
received was 32.


of all three samples fell in the same place. For 18 
additional questions, there was a deviation of but 
one unit, the recruit sample differing from both 
Yale samples by this amount on 16 of these 18 
questions. For only one of the 6 remaining ques- 
tions, No. 28, was there a deviation of more than 
two units. It would appear that when using the 
TAQ with male Ss, little error will be introduced 
if 29 of the 35 questions are scored directly from 
the published norms. Calculation of local norms 
for the remaining 6 questions (Nos. 5, 10, 24, 28, 
36, and 39) would appear to be advisable. 
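The tally described above can be checked with a short sketch. The values below are my transcription of Table 1, with each median interval coded by its lower bound (4 stands for "4-5"); a few cells are garbled in the scan and were read so as to agree with the counts given in the text, so the data should be treated as an assumption rather than as the published figures.

```python
# Median intervals from Table 1, coded by lower bound (4 means "4-5").
# Tuple order: (recruits 1953, Yale 1951, Yale 1952).
# Transcription assumption: garbled cells were read to match the text's counts.
MEDIANS = {
    4: (4, 3, 3), 5: (5, 3, 3), 6: (4, 3, 3), 7: (4, 4, 4), 8: (3, 2, 2),
    9: (3, 4, 4), 10: (4, 2, 2), 11: (4, 3, 3), 12: (1, 2, 2), 13: (0, 0, 0),
    14: (2, 2, 2), 15: (4, 3, 3), 17: (5, 4, 4), 18: (4, 4, 4), 19: (3, 4, 4),
    20: (5, 4, 4), 21: (5, 4, 4), 22: (4, 4, 4), 23: (4, 4, 4), 24: (4, 3, 2),
    25: (4, 4, 4), 26: (3, 3, 3), 27: (2, 1, 1), 28: (5, 2, 3), 29: (5, 6, 6),
    30: (4, 4, 4), 31: (4, 5, 4), 32: (5, 4, 4), 33: (3, 4, 4), 34: (3, 4, 4),
    35: (4, 4, 4), 36: (4, 6, 6), 37: (3, 3, 3), 38: (1, 2, 1), 39: (4, 3, 2),
}

def spread(triple):
    """Largest difference, in scoring units, among the three sample medians."""
    return max(triple) - min(triple)

identical = [q for q, m in MEDIANS.items() if spread(m) == 0]
one_unit = [q for q, m in MEDIANS.items() if spread(m) == 1]
local_norms = [q for q, m in MEDIANS.items() if spread(m) >= 2]
more_than_two = [q for q, m in MEDIANS.items() if spread(m) > 2]

# Among the one-unit questions, those where the recruits differ from both Yale samples.
recruit_differs_from_both = sum(
    1 for q in one_unit
    if MEDIANS[q][0] != MEDIANS[q][1] and MEDIANS[q][0] != MEDIANS[q][2]
)

print(len(identical), len(one_unit), local_norms, more_than_two)
```

Under this transcription the six questions with a spread of two or more units are the same six for which local norms are recommended.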

With respect to the over-all scores the Ss received 
on the TAQ, the distributions of total scores are 
almost identical for all three samples. The median 
TABLE 3


CORRELATIONS BETWEEN SCORES ON THE TAQ,
WINNE, AND TAYLOR SCALES


           Low Stress Ss (N = 148)      High Stress Ss (N = 146)
Test         TAQ       Winne              TAQ       Winne

Winne       +.12                         +.22**
Taylor      +.20*     +.62**             +.35**    +.64**


* Statistically significant; p < .05. 
** Statistically significant; p < .01. 


score reported for both Yale samples falls between 
18 and 19 (3, p. 447), and it falls at the same point 
for the recruit sample. Table 2 presents additional 
data on the distribution of total scores for the 
three samples. It is evident that similar cutoff 
scores would discriminate about the same percent- 
age of Ss in each sample (also see 4, p. 811, 
Fig. 1). 

The relationship between scores on a measure of 
anxiety specific to testing situations and measures 
of general anxiety level is also of interest. Gordon 
and Sarason (2) report a statistically significant 
correlation of +.468 between TAQ scores and 
scores on a measure of general anxiety. The recruit 
sample, in addition to the TAQ, also took the 
MMPI, which contains two measures of manifest 
anxiety. Ss’ scores on the 50-item Taylor Manifest 
Anxiety Scale (6) and on the 22 items of Winne’s 
Scale of Neuroticism (7)³ which do not overlap
with Taylor’s scale were obtained. These measures 
were developed independently, using different 
methods of scale construction. 

Prior to taking either the TAQ or MMPI, the 
recruits had been tested on a battery of cognitive 
tasks, half under low stress and half under high 
stress. The product-moment correlations between 
scores on the three anxiety scales were calculated 
separately for the low and high stress groups. Ta- 
ble 3 presents both sets of correlations. Although 
most of the correlations reach levels of statistical 
significance with the large number of Ss involved, 
the correlations between the TAQ and general 
anxiety measures are low. The TAQ can therefore 
be considered as a measure of a specific anxiety. 
An S who is prone to become highly anxious in 
testing situations may not necessarily be particu- 
larly vulnerable to other kinds of stressful situa- 
tions unrelated to testing. 
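The significance markers in Table 3 can be reproduced with the usual t test for a product-moment correlation, t = r√(N − 2)/√(1 − r²). The sketch below is illustrative only: the critical values 1.98 and 2.61 are approximate two-tailed values for roughly 145 degrees of freedom and are my assumption, not figures from the report.

```python
import math

# Approximate two-tailed critical t values for df near 145 (assumed, not from the source).
T_CRIT_05 = 1.98
T_CRIT_01 = 2.61

def t_for_r(r, n):
    """t statistic for testing a product-moment correlation r from n cases against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def stars(r, n):
    """Return '**' for p < .01, '*' for p < .05, '' otherwise."""
    t = abs(t_for_r(r, n))
    if t >= T_CRIT_01:
        return "**"
    if t >= T_CRIT_05:
        return "*"
    return ""

# Correlations reported in Table 3: (label, r, N).
TABLE_3 = [
    ("low stress, TAQ-Winne", 0.12, 148),
    ("low stress, TAQ-Taylor", 0.20, 148),
    ("low stress, Winne-Taylor", 0.62, 148),
    ("high stress, TAQ-Winne", 0.22, 146),
    ("high stress, TAQ-Taylor", 0.35, 146),
    ("high stress, Winne-Taylor", 0.64, 146),
]

for label, r, n in TABLE_3:
    # r * r is the proportion of variance the two scales share.
    print(f"{label}: r = {r:+.2f}{stars(r, n)}, shared variance = {r * r:.0%}")
```

Even the largest TAQ correlation (+.35) corresponds to about 12% shared variance, which is the sense in which the correlations are low although statistically significant.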


SUMMARY 
The TAQ responses of a sample of U. S. Air Force 
recruits, differing from Yale students with respect 


3 Winne’s interpretation of his results indicates that 
his scale can be considered a measure of manifest anx- 
iety (7, p. 120-121). 
to social class, amount of education, and intelligence
level, proved to be very similar to the responses
of two samples of Yale college students.
Median responses to 29 of the 35 questions were
within one scoring unit for all three samples, indicating
that a more efficient direct scoring method
for these items is feasible. For all three samples,
the distributions of total scores on the TAQ were
almost identical. While positively related to scores
on measures of general anxiety, TAQ scores appear
to reflect particular sensitivity to the specific type
of stress associated with testing.
AUTHORITARIANISM, VERBAL ABILITY, AND RESPONSE SET¹


DONALD R. BROWN AND LOIS-ELLIN DATTA²
Bryn Mawr College 


EVIDENCE that the F scale (1) was measuring
variables other than authoritarianism, e.g., 
age, socioeconomic status, intelligence, and the 
form in which the items were presented, has 
been reported by Hyman and Sheatsley (5). The 
problem of item form as a source of artifacts has 
been investigated by a number of authors whose 
work has been reviewed in detail by Christie et al. 
(3), who also report data gathered with a care- 
fully devised scale of reversed F items. The pres- 
ent study was designed to clarify the meaning of 
observed changes in F scores during four years 
of a liberal arts college experience as these changes 
relate to possible artifacts arising from such 
factors as verbal ability, educational level, and 
response set. 

Existing evidence clearly suggests that some of 
the variance on the original F scale can be at- 
tributed to such artifacts, particularly in groups 
that are not extreme on F. However, it was our 
thought that a more subtle instrument, derived 
empirically with the F scale as a criterion, might 


1 The data reported here were collected at Vassar 
College as part of the research program of the Mary 
Conover Mellon Foundation. The authors wish to 
express their appreciation to Nevitt Sanford, Harold 
Webster, Robert S. Davidon, and Mary Mehl for 
their assistance in gathering and interpreting the data. 

2 Now with Courtney & Co., Philadelphia, Pa. 


not be so open to this type of criticism. An in- 
strument of this kind, the F, scale, has been de- 
veloped by Webster et al. (8, 9, 11). In addition, 
a scale for measuring response set (9), and a 
method for removing the variance of such a re- 
sponse set scale (10), have been reported by 
Webster and are described in the next sections. 
We have also hypothesized, in line with the find- 
ings of Brown and Bystryn (2) and of Sanford 
(6), that education effects real changes in per- 
sonal-cognitive modes of adaptation, and that 
these are reflected in F, scores and are not mere 
artifacts of verbal ability or sophistication in a 
purely genteel sense. 


METHOD 
Variables 


In our analysis we explored the interactions of 
authoritarianism as measured by the F, scale, 
verbal ability (SAT-V) as measured by the verbal 
aptitude score on the scholastic aptitude section 
of the College Entrance Examination Boards 
taken during the last year of high school, the 
educational level of the Ss as measured by year 
in college, and response set as measured by the 
S’s score on a scale developed, according to the 
method of Fricke (4, 10), from a pool of 677 items 
of which F, is a part. 
TABLE 1
UPPER AND LOWER LIMITS FOR THE UPPER, MIDDLE,
AND LOWER THIRDS OF THE SAT-V AND Re SCALE
DISTRIBUTIONS FOR FRESHMEN AND SENIORS
IN THE SAMPLE OF 512 SUBJECTS


                      Re                              SAT-V
Group       Lower   Middle   Upper        Lower      Middle     Upper
Freshmen    6-33    34-49    50-84       430-544    545-578    579-720
Seniors     6-34    35-48    49-87       390-537    538-595    596-750
TABLE 2


F, SCORE MEANS FOR 135 COLLEGE FRESHMEN AND
135 SENIORS DIVIDED INTO THIRDS ON THE BASIS
OF THEIR Re AND SAT-V SCORES


                          SAT-V
Re               Upper    Middle    Lower    Mean

Upper    Fr.     64.5     68.7      86.7     73.3
         Sr.     56.4     60.5      68.9     61.9
Middle   Fr.     55.7     61.7      68.3     61.9
         Sr.     50.8     51.9      61.3     54.7
Lower    Fr.     47.7     54.2      58.5     53.5
         Sr.     35.9     45.9      48.1     43.3

Mean     Fr.     56.0     61.5      71.2
         Sr.     47.7     52.8      59.4

Briefly, this scale of response set was based 
on the assumption that true-false items which 
are responded to in a 50/50 split are more likely 
than other items to have an ambiguous content. 
Ss who score at the extremes on a scale con- 
structed of such items should therefore be dis- 
tinguished by response in terms of some internal 
tendency to answer positively to ambiguous state- 
ments. The present response set scale consists of 
105 items all scored “true” with means ranging 
from .32-.54 (Type II Error β < .01). Its KR 21
reliability was .91 for 512 Ss. 


Subjects 


From our original sample of over 1500 fresh- 
men and seniors tested at a women’s liberal arts 
college, a random sample of 512 students (256 
freshmen and 256 seniors) was drawn for experi- 
mental study by the staff of the Mellon Founda- 
tion. From the scores of this sample on SAT-V 
and Re, four grouped frequency distributions 
were made in order to choose, for a comparison 
in terms of F scores, Ss who fell in the upper, 
middle, and lower thirds of the SAT-V and Re
distributions. Table 1 describes these distribu- 
tions for the sample of 512. From this stratified 
sample, 15 cases were randomly selected from 


TABLE 3
CORRECTED F, SCORE MEANS FOR 135 COLLEGE
FRESHMEN AND 135 SENIORS DIVIDED INTO
THIRDS ON THE BASIS OF THEIR Re AND
SAT-V SCORES


                          SAT-V
Re               Upper    Middle    Lower    Mean

Upper    Fr.     33.3     36.8      51.3     40.5
         Sr.     25.7     29.7      37.3     30.9
Middle   Fr.     33.6     39.3      46.5     39.8
         Sr.     29.3     [remaining entries illegible]
Lower    Fr.     34.9     42.7      45.9     41.2
         Sr.     22.7     32.8      36.3     30.6
Mean     Fr.     33.9     39.6      47.9     40.5
         Sr.     25.9     [remaining entries illegible]


each of the 18 possible combinations of variables 
we wished to study by an analysis of variance 
design (for example, 15 Ss who were seniors in 
the upper third of the senior SAT-V distribution 
and the upper third of the Re distribution, etc.).
The groups were thus equalized in terms of size 
in order to maximize the reliability of the esti- 
mated interaction analysis. 
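The balanced design just described can be enumerated in a few lines. This is only a sketch of the cell structure; the labels are mine, and no subject data are involved.

```python
from itertools import product

# Factor levels of the design described above (labels are illustrative).
YEARS = ("freshman", "senior")
SAT_V_THIRDS = ("upper", "middle", "lower")
RESPONSE_SET_THIRDS = ("upper", "middle", "lower")
CASES_PER_CELL = 15

# Every combination of year, SAT-V third, and response-set third is one cell.
cells = list(product(YEARS, SAT_V_THIRDS, RESPONSE_SET_THIRDS))
total_cases = len(cells) * CASES_PER_CELL
cases_per_year = total_cases // len(YEARS)

print(len(cells), total_cases, cases_per_year)
```

Equal cell sizes are what allow the main effects and interactions in the later analysis of variance to be estimated independently of one another.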


RESULTS 


The 270 F, scores were then corrected for re- 
sponse set by the method proposed by Webster 
(10). Briefly, this method consists of obtaining a 
score Y uncorrelated with a suppressor variable t
by weighting the variable t by the regression
coefficient k of the variable t on T and subtracting
the result from the original score T. Thus
each score on F, can be corrected for Re by the 
formula Y = Fy, — (k)Re where k = the linear 
regression of Re on Fy. 
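The correction can be illustrated with a minimal sketch. It assumes that k is the least-squares slope obtained when the original score T is regressed on the suppressor t, that is, k = cov(T, t)/var(t), which is the weight that leaves Y = T − k·t uncorrelated with t. The scores below are synthetic and purely hypothetical; they are not the study's data.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Sample covariance of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def corr(xs, ys):
    """Product-moment correlation."""
    return cov(xs, ys) / (cov(xs, xs) * cov(ys, ys)) ** 0.5

def correct_for_suppressor(T, t):
    """Return (Y, k) where Y = T - k*t and k = cov(T, t) / var(t)."""
    k = cov(T, t) / cov(t, t)
    return [Ti - k * ti for Ti, ti in zip(T, t)], k

# Hypothetical data: 270 response-set scores and scale scores that partly depend on them.
rng = random.Random(0)
t = [rng.gauss(45, 15) for _ in range(270)]
T = [30 + 0.6 * ti + rng.gauss(0, 10) for ti in t]

Y, k = correct_for_suppressor(T, t)
print(f"k = {k:.3f}")
print(f"r(T, t) = {corr(T, t):.3f}")
print(f"r(Y, t) = {corr(Y, t):.3f}")
```

With this weight the corrected scores are uncorrelated with the suppressor by construction, so any remaining relation of Y to other variables cannot be attributed to a linear effect of the response-set score.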

Tables 2 and 3 give the resulting distribution 
of F, and corrected Fy, scores with means for 
freshmen and seniors in the sample. It can be 
seen that seniors score lower than freshmen on F, 
and on corrected F,; that F, varies positively with 
Re and inversely with SAT-V. 

The data of Table 2 were subjected to two 
analyses of variance; one for F, and one for cor- 
rected F, scores. Table 4 gives the results for F, 
scores and Table 5 those for corrected F, scores. 
In Table 4 for the F, scores the F ratio of the Re
and the within groups mean square* was 48.36;
for year in college, 35.22; for the SAT-V, 24.18.
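As a check on the arithmetic, each F ratio is the mean square for a source (its sum of squares divided by its degrees of freedom) divided by the within-groups mean square. The sketch below uses the sums of squares and degrees of freedom printed in Table 4; because the published table works from rounded mean squares, the results agree with the printed ratios only to within rounding.

```python
# Sums of squares and degrees of freedom as printed in Table 4.
TABLE_4 = {
    "Response set": (16635, 2),
    "Year in college": (6059, 1),
    "SAT-V": (8318, 2),
    "Response set x SAT-V": (814, 4),
}
WITHIN_SS, WITHIN_DF = 43292, 252

def f_ratio(ss, df, within_ss=WITHIN_SS, within_df=WITHIN_DF):
    """F = (SS / df) / (within-groups SS / within-groups df)."""
    return (ss / df) / (within_ss / within_df)

f_values = {source: f_ratio(ss, df) for source, (ss, df) in TABLE_4.items()}

for source, f in f_values.items():
    print(f"{source}: F = {f:.2f}")
```

By the rows of Table 4, the ratio near 35 belongs to year in college (1 degree of freedom) and the ratio near 24 to SAT-V (2 degrees of freedom).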


* Since Bartlett’s test for homogeneity of variance
when applied to the F, categories for the sample of 270
yielded a χ² of 6.4531 which is not significant, it
seemed safe to attribute significant F ratios to differences
in subgroup means.
TABLE 4
ANALYSIS OF VARIANCE FOR A SAMPLE OF 270 STUDENTS
WITH YEAR IN COLLEGE, SAT-V, AND Re
SCORES AS THE VARIABLES WHOSE EFFECTS
ON F, ARE TO BE EVALUATED


Source                   Sum of Squares    df    Mean Square    F

Response set                 16635           2       8318       48.36ᵃ
Year in college               6059           1       6059       35.22ᵇ
SAT-V                         8318           2       4159       24.18ᵃ
Re × SAT-V                     814           4        204        1.18ᶜ
Re × Year                      194           2         97        -
SAT-V × Year                    36           2         18        -
Year × SAT-V × Re              386           4         97        -
(Between groups)            (32442)        (17)

Within groups                43292         252        172
Total                        75734         269


ᵃ F.001 = 7.31 with df gms 2 and sms 120.
ᵇ F.001 = 11.38 with df gms 1 and sms 120.
ᶜ F.05 = 2.45 with df gms 4 and sms 120.


TABLE 5


ANALYSIS OF VARIANCE FOR A SAMPLE OF 270 STUDENTS
WITH YEAR IN COLLEGE, SAT-V, AND Re
SCORES AS THE VARIABLES WHOSE EFFECTS
ON CORRECTED F, SCORES ARE TO BE
EVALUATED


Source                   Sum of Squares    df    Mean Square    F

Response set                    16           2          8
SAT-V                         7241           2       3620.5     22.29ᵃ
Year in college               5549           1       5549       24.17ᵇ
Re × SAT-V                     570           4        142.5
Re × Year                      141           2         70.5
Year × SAT-V                    88           2         44
Year × SAT-V × Re              252           4         63
(Between groups)            (13857)        (17)
Within groups                40922         252        162.4
Total                        54779         269


ᵃ F.001 = 7.31 with df gms 2 and sms 120.
ᵇ F.001 = 11.38 with df gms 1 and sms 120.


The above are all significant at better than the 
.001 level. No other interactions were significant. 
Inspection of Table 5 shows that the relation of 
Re to corrected F, disappears as is to be expected, 
since it is treated as a suppressor. Verbal ability 
(SAT-V) and year in college remain significantly 
related to corrected F, at the .001 level. All other
interactions fail to achieve significant F ratios. 


DISCUSSION 


The fact that verbal ability and year in college, 
both high correlates of intelligence, remain sig- 
nificantly related to Fy, even with response set 


removed, but are not related to response set in 
the same degree, supports the contention that 
both of these variables and authoritarianism are 
related in a more basic way than as a simple re- 
flection of acquiescence. Christie’s (3) work with 
reversed items of the original F scale further 
supports this contention. 

It would appear, then, that neither verbal 
ability nor year in college accounts for high au- 
thoritarianism scores through the lack of dis- 
cernment on the part of the S in the face of 
verbal material. Indeed, the striking fact of a lack 
of significant relationship between response set 
and year in college in our findings indicates that
response set measures a basic disposition to struc- 
ture ambiguity in a creative process which per- 
mits agreement rather than rejection by the S. 
Perhaps this is another reflection of the general 
superficial tolerance in the face of controversy so 
often attributed to the present generation of 
students. The decrease of F, with year in college 
can then be explained in line with the similar 
findings on development in college or a real in- 
crease in liberal and intellectual tendencies in- 
dependently of simple verbal facilitation, 
sophistication or poise, and underlying disposi- 
tion to be agreeable. In less well educated groups, 
on the other hand, response set may merely be a 
way of defending against lack of comprehension. 


SUMMARY 


Two hundred and seventy college women, 
half freshmen and half seniors, were administered 
the F, and Rez scales as a part of a larger test bat- 
tery. The scores were analyzed by an analysis of 
variance design to explore the relationships among 
F, scores corrected for response set, year in col- 
lege, and verbal ability. The criticism of the au- 
thoritarian syndrome on the basis of acquiescence 
is challenged by the results. 
INDIVIDUAL VERSUS GROUP GOAL CONFLICT¹ ² ³


EWART E. SMITH 


Fels Group Dynamics Center, University of Delaware 


BOTH casual observation and research suggest
that an important determiner of group effi- 
ciency and adaptiveness is how group members re- 
solve conflicts between individual and group goals. 
Deutsch (3) has demonstrated the importance of 
cooperative behavior in effective groups. Mintz (4), 
using an ingenious group task that required coop- 
eration for success, found that increasing individual 
motivation frequently decreased cooperation and 
therefore lowered efficiency. He theorizes that un- 
cooperative behavior, being nonadaptive, does not 
occur in a group unless the cooperative pattern is 
broken by the uncooperative deviate. Once the 
cooperative pattern is disturbed, however, cooper- 
ation is no longer rewarding, and nonadaptive 
competition rapidly develops. 

Mintz appears to postulate that the forces arising 
from the norms of internalized reference groups are 
sufficiently strong, in most individuals, to counter- 
act individual, competitive motivations. The con- 
tention here, however, is that such internalized 


¹ This report is based on work done under ARDC
Project No. 7723, Task No. 77461, in support of the 
research and development program of the Air Force 
Personnel and Training Research Center, Lackland 
Air Force Base, Texas. Permission is granted for re- 
production, translation, publication, use and disposal 
in whole or in part by or for the United States Govern- 
ment. The opinions or conclusions expressed or implied 
herein are those of the author. They are not to be con- 
strued as necessarily reflecting the views or endorse- 
ment of the Department of the Air Force or of the Air 
Research and Development Command.

² The writer is indebted to Carl W. Backman, Phil
W. Buck, Willard F. Day, Robert McQueen, Edwin H.
Richardson, Paul F. Secord, Walter A. S. Smith, E.
Paul Torrance, and to Charles E. Hawkins, who served
as experimenters and gave helpful criticisms and advice.

³ This study was conducted while the author was
with the Survival Methods Branch, Air Force Personnel 
and Training Research Center, Stead Air Force Base. 


group forces are not strong enough to counteract 
the strong individual goal forces that are aroused 
in many real situations, such as Mintz’s example 
of the theatre fire; strong individual goal behavior 
can only be successfully opposed by immediate 
visible external group forces. 

These considerations lead to the following hy- 
potheses: 

1. When faced with an alternative, persons will 
choose the attainment of their own goal in prefer- 
ence to that of a group goal less frequently in an 
overt situation, in which their choice is known by 
the group, than in a covert situation where their 
choice is secret. 

2. When faced with an alternative, persons will
choose the attainment of their own goal in preference
to that of a group goal more frequently
under increased individual motivation than under
reduced individual motivation.

The first hypothesis is consistent with the powerful effects
of groups demonstrated in research on conformity,
such as that by Asch (1). It is inconsistent,
however, with Mintz’s position, which implies
that in a covert situation most individuals should
be cooperative, as the infrequent uncooperative act
would not be perceived by the group and the cooperative
pattern should not, therefore, be disturbed.
The second hypothesis appears to be self-evident
and is consistent with Mintz’s data.


METHOD 

Subjects 

The subjects (Ss) were 120 male flying personnel, 
including 92 officers and 28 enlisted men, who 
were attending the Survival School at Stead Air 
Force Base. The six-man experimental groups were 
actual Air Force crews. Some crews had been to- 
gether for many months; others consisted of men 
assigned to the same crew for their stay at the 
Survival School. All crews had been living together
for a minimum of five days. 


Procedure

Two independent variables were used; the first 
was high individualistic motivation, induced by 
an evaluation set, versus low individualistic moti- 
vation. The second independent variable involved 
comparison of an overt situation, in which the 
group members knew when an S failed to sacrifice 
his solution to help the group, with a covert situ- 
ation in which the group members did not know 
when an S failed to sacrifice. There were thus four 
conditions (with five six-man crews in each): (a)
high individualistic motivation, overt; (b) high
individualistic motivation, covert; (c) low individualistic
motivation, overt; and (d) low individualistic
motivation, covert. Ten experimenters were used.

The procedures, instructions, and apparatus⁴
were copied from Crutchfield’s assessment tech- 
nique, with minor modifications. The reader is re- 
ferred to Crutchfield (2) for a detailed description. 
The Ss were seated in a circle with their backs to 
one another. Each S was to assemble a square with 
some geometric pieces. As the initial pieces held by 
each S did not form a square, the Ss had to request 
and exchange pieces by means of a tray carried 
around the group by the experimenter. The in- 
structions indicated that it was easy for one or two 
people to form a square, but difficult for everyone 
to have a square simultaneously. The Ss were told 
that at the end of the allotted time (unspecified), 
if they each had a square, the group would receive
30 points (i.e., 5 points each). If they did not each 
have a square, those who did would each receive 5 
points. 
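The scoring rule can be written out as a small function, which makes the conflict built into the task explicit. This is a sketch under one assumption that the instructions do not state: an S without a completed square is taken to receive 0 points.

```python
POINTS_PER_SQUARE = 5

def payoffs(has_square):
    """Points for each S, given a list of booleans (True = S holds a completed square).

    If every S has a square the group receives 5 points per member (30 for six men).
    Otherwise only those holding a square receive 5 points; the others are assumed
    to receive 0, which the instructions imply but do not state.
    """
    if all(has_square):
        return [POINTS_PER_SQUARE] * len(has_square)
    return [POINTS_PER_SQUARE if done else 0 for done in has_square]

# A six-man crew in which everyone finishes: 30 points in all.
print(sum(payoffs([True] * 6)))

# An S who keeps his square while others lack theirs still earns his 5 points.
print(payoffs([True, True, False, True, False, True]))
```

Under this rule an S who holds a completed square never gains points for himself by breaking it, which is why giving up a requested piece counts as a sacrifice of the individual goal.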

By manipulating the tray, the experimenter was 
able to control the situation. Each S passed 
through four identical rounds. In the first round 
he requested a needed piece; in the second, he re- 
ceived it and completed his square; in the third 
and fourth rounds he held his square unmodified 
since there was no request for a piece held by him. 
The stage was now set for the critical trials, where 
stress was placed upon each S by facing him with 
a request for a piece which was part of his com- 
pleted square. These critical trials were continued 
until each S had given up the requested piece, or 
until there had been sixteen critical trials. 

The basic instructions and procedures were varied 
in the four experimental conditions, in the follow- 
ing manner: 


A. High individualistic motivation conditions
(both overt and covert): The instructions stated


⁴ The only modification in the apparatus was the
use of the word “request” on the underside of the request
pieces.


that the experimenters had been asked to evaluate 
the intellectual aptitude of the Ss and would do so 
by means of a test. They were told that the results 
would be sent to their commanding officers, who 
would discuss the results with them. 

B. Low individualistic motivation conditions 
(both overt and covert): The instructions stated 
that the task was an experimental one, not known 
to measure anything. The Ss were told that it 
would not be necessary to give their names. 

C. Covert conditions (high individualistic moti- 
vation and low individualistic motivation): The 
following sentences were added to the basic in- 
structions: 


We have provided you each with a box to make 
sure that no one sees the solution to the problem by 
looking at all the pieces at once, or even just at those of 
a person to his side. However, since the boxes don’t 
cover up the pieces entirely, you should look straight 
ahead and not to either side. 

On any given presentation you may either simply
pass, that is, by shaking your head indicate to the 
messenger that you neither want to exchange nor to 
request a piece, or both exchange and request. 


D. Overt conditions (high individualistic moti- 
vation and low individualistic motivation): In 
these conditions, the basic instructions were modi- 
fied as follows: 


On any given presentation you may either simply say 
“Pass” indicating to the messenger that you neither 
want to exchange nor to request a piece, or you may 
exchange one piece or request one piece, or both ex- 
change and request. 


In addition, when the experimenter presented 
the tray on a critical trial he said, “Someone needs 
one of your pieces.” If the S passed, the experi- 
menter repeated after him, “You pass.” 

An S’s score was the trial on which he gave up 
the requested piece. 


The Design in Summary 

The procedure described was, in summary, in- 
tended to create the following psychological condi- 
tions: 

High individualistic motivation. The individual 
was in danger of losing status, in his own mind and 
that of his commanding officer, because of failure 
on an intellectual task. At the same time, he was 
confronted with requests from others for help. To 
help the others would increase the possibility that 
the individual himself would fail. 

Low individualistic motivation. There was little 
extrinsic motivation for attempting to do well. The 
results would be anonymous, and the task was not 
known to measure anything. However, we may as- 
sume some minimal motivation to do well. Again 
TABLE 1
NUMBER OF SUBJECTS WHO SACRIFICED ON EACH
CRITICAL TRIAL IN FOUR EXPERIMENTAL
CONDITIONS


Critical Trials       High Individualistic Motivation, Overt

1-4                   20
5-8                    2
9-12                   0
13-16                  0
Never sacrificed       8

[Columns for the other three conditions (low individualistic motivation, covert;
low individualistic motivation, overt; high individualistic motivation, covert)
are illegible.]


Note. N = 30 in each condition.


TABLE 2


NUMBER OF SUBJECTS WHO SACRIFICED IN FOUR
EXPERIMENTAL CONDITIONS


                                     Overt    Covert    Totals

High individualistic motivation       22         8        30

Low individualistic motivation        27        18        45


Totals                                49        26


Note. N = 30 in each condition.


the individuals were in conflict between their motivations
to do well and their desire to help their
associates.

Overt conditions. The individual knew that his
decision, as to how he resolved his conflict, would
be known by the group.

Covert conditions. The individual knew that the 
group would not know how he resolved his conflict. 
The covert situation was only relatively covert, 
however, as the experimenter obviously knew how 
the S resolved it. The Ss appeared to feel under
pressure from the experimenter to help their group,
as indicated, among the Ss who had not sacrificed, by
such behavior as ignoring the presence of the experimenter
and the tray on subsequent critical
trials, avoidance of eye contact with the experimenter,
etc.

Observer reports and interviews with Ss in- 
dicated that the experimental manipulations pro- 
duced the desired psychological conditions. 


RESULTS 


The data in Table 1 indicate when the Ss, in each 
condition, broke their squares, thus jeopardizing 
their individual solution to help their group and, 
in effect, temporarily giving up their own goals in 
favor of group goals. As the distributions are 
skewed, the data have been simplified to permit a
chi square analysis (see Table 2).


The chi square on the number of Ss in all the
covert groups who broke their squares, compared
to the number in all the overt groups, was 18.81, 
significant at the .01 level. These data support the 
first hypothesis. 

The chi square on the number of Ss breaking 
their squares in all high individualistic motivation 
groups, compared to the number in all low indivi- 
dualistic motivation groups, was 8.00, significant 
at the .01 level. These results appear to support the 
second hypothesis. However, inspection of the data 
indicates a possible interaction effect between the 
variables of overt-covert and individualistic moti- 
vation. Interaction was tested by comparing the 
Ss in the high individualistic motivation-covert 
groups with the Ss in the low individualistic moti- 
vation-covert groups. The resulting chi square was 
6.79, significant at the .01 level. In addition, the 
high individualistic motivation-overt groups were 
compared with the low individualistic motivation- 
overt groups. The resulting chi square of 1.78, was 
not statistically significant. Apparently, then, there 
was a significant interaction effect between indi- 
vidualistic motivation and covert-overt conditions, 
with the variable of individualistic motivation being 
potent only in the covert groups. 
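The four chi squares reported above can be recomputed from the 2 × 2 counts of Ss who did and did not sacrifice. In the sketch below the counts come from Table 2 with N = 30 per condition; the two overt cells (22 and 27) are inferred from the marginal totals. The use of Yates' continuity correction for the overt comparison is my inference from the fact that it reproduces the reported 1.78; the report does not say which formula was used.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi square for the 2 x 2 table [[a, b], [c, d]], optionally with Yates' correction."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Ss who sacrificed, out of 30 per condition (overt cells inferred from the totals).
HIGH_OVERT, HIGH_COVERT, LOW_OVERT, LOW_COVERT = 22, 8, 27, 18
N_CELL = 30

overt = HIGH_OVERT + LOW_OVERT      # 49 of 60
covert = HIGH_COVERT + LOW_COVERT   # 26 of 60
high = HIGH_OVERT + HIGH_COVERT     # 30 of 60
low = LOW_OVERT + LOW_COVERT        # 45 of 60

overt_vs_covert = chi_square_2x2(overt, 60 - overt, covert, 60 - covert)
high_vs_low = chi_square_2x2(high, 60 - high, low, 60 - low)
covert_high_vs_low = chi_square_2x2(
    HIGH_COVERT, N_CELL - HIGH_COVERT, LOW_COVERT, N_CELL - LOW_COVERT)
overt_high_vs_low = chi_square_2x2(
    HIGH_OVERT, N_CELL - HIGH_OVERT, LOW_OVERT, N_CELL - LOW_OVERT, yates=True)

print(round(overt_vs_covert, 2), round(high_vs_low, 2),
      round(covert_high_vs_low, 2), round(overt_high_vs_low, 2))
```

Without the continuity correction the overt comparison comes to about 2.78, which is still below the 3.84 needed for significance at the .05 level with one degree of freedom, so the conclusion does not depend on which formula is used.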

An attempt was made to relate cohesiveness to 
the readiness of Ss to jeopardize their own solu- 
tions to help their fellow group members. On the 
assumption that the longer crew members had 
been together the more cohesive they would be, the 
number of months each S had served with the other 
members of his crew was correlated with readiness 
to break his square. No significant relationship was 
found.⁵

An interesting post hoc finding⁶ is seen in Table
1, in the striking difference between the overt and 
covert conditions in the number of critical trials 
occurring before Ss broke their squares. In the 
overt condition, all but one of the Ss who broke 
did so in the first half of the 16 critical trials. In the 
covert conditions, 14 broke in the first half of the 
critical trials, and 12 in the last half. A chi square 
comparison of the overt and covert conditions on 
breaking in the first eight trials versus breaking in 
the last eight trials is 23.07, which is highly signif- 
icant. These data suggest that in the overt condi- 
tions, those Ss who refused to break their squares 
in the first few trials had made a public decision 
which was difficult to change without public ad- 


5 Subsequently, data were collected on additional 
crews, using as a measure of cohesiveness the degree of 
their desire to remain in the same crew during the 
arduous survival trek phase of their training. Again 
there was no relationship between cohesiveness and 
willingness to sacrifice one’s own goal for the group 
goal. 

⁶ This analysis, and interpretation, was suggested
to the writer by John T. Lanzetta. 
mission of wrong-doing, whereas in the covert con-
ditions their decisions were not public and pre- 
sumably could be more easily changed. If this 
interpretation is correct, it serves to explain a 
striking phenomenon observed by the experimenters.
Members of the covert groups evidenced high ten- 
sion by signs such as perspiring and avoidance be- 
havior, whereas members of the overt groups were 
relatively calm. We may suppose that in the overt 
groups early decisions ended the conflict, while 
continued freedom to change prolonged the conflict 
in the covert groups, with resultant higher tension 
levels. 
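The timing comparison can be checked in the same way. The counts below are taken from the text: of the 49 overt Ss who broke their squares, all but one did so in the first eight critical trials, and of the 26 covert Ss, 14 did so in the first eight and 12 in the last eight. The function is the ordinary uncorrected 2 × 2 chi square; that this is the formula the author used is an assumption, supported only by its agreement with the reported value.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected chi square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Rows: overt, covert. Columns: broke in trials 1-8, broke in trials 9-16.
OVERT_EARLY, OVERT_LATE = 48, 1
COVERT_EARLY, COVERT_LATE = 14, 12

timing_chi_square = chi_square_2x2(OVERT_EARLY, OVERT_LATE, COVERT_EARLY, COVERT_LATE)
late_share_overt = OVERT_LATE / (OVERT_EARLY + OVERT_LATE)
late_share_covert = COVERT_LATE / (COVERT_EARLY + COVERT_LATE)

print(round(timing_chi_square, 2))
print(f"late sacrifices: overt {late_share_overt:.0%}, covert {late_share_covert:.0%}")
```

About 2% of the overt sacrifices came in the last eight trials, against about 46% of the covert ones, which is the contrast the interpretation above rests on.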
DISCUSSION 

The relative lack of cooperative behavior ob- 
served in the covert groups, compared to the overt 
groups, is contrary to Mintz’s (4) postulate that 
uncooperative group behavior is due to the per- 
ception by the group of an uncooperative act on 
the part of a deviant individual. The Ss in the 
covert groups did not know that some of the others 
were being uncooperative. In the overt groups, 
however, at least one member of each group was 
openly uncooperative on the first trial, yet most 
Ss soon cooperated with the group. 

The ineffectiveness of individualistic motivation 
in the overt conditions is contrary to Mintz’s (4) 
results. Mintz found, in what was in effect an overt 
situation, that the addition of small monetary re- 
wards and punishments produced an increase in 
uncooperative, individual-oriented behavior. This 
inconsistency may be due to the lack of comparability
of the laboratory and field conditions.⁷

In Mintz’s research, causing others to fail could,
at most, result in their paying a ten-cent fine,
whereas in the present experiment, failure could
have far-reaching effects on the other crew members’
military careers. In addition, Mintz’s groups were
ephemeral in contrast to the real crews used here,
who would have to work and live together after the
test.

7 Mintz (4) states that his conclusions are tentative
until verified by field data.

The practical implications of this research are
clear. When it is desirable that persons be socially
rather than individually oriented, the wise course
is to structure the situation so that most behavior
is open to inspection by the group.


SUMMARY


An experiment was performed in which a conflict
was produced between individual and group goal
attainment. An overt situation was compared to a
covert situation, and a high individualistic
motivation condition with a low individualistic
motivation condition. The hypotheses were:

1. When faced with an alternative, persons will
choose the attainment of their own goal in preference
to that of a group goal less frequently in an overt
situation, in which their choice is known by the
group, than in a covert situation where their choice
is secret.

2. When faced with an alternative, persons will
choose the attainment of their own goal in preference
to that of a group goal more frequently under
increased individualistic motivation than under
reduced individualistic motivation.

The first hypothesis was supported. The second
hypothesis was found to hold only in covert
situations.

A post hoc finding that individuals apparently
feel freer to change secret decisions than public
decisions is discussed.


SEMANTIC ASPECTS OF PROGNOSIS¹


1 Part of this paper was read at the American
Psychological Association meeting in 1957.


THE measurement of meaning, though obviously
important, is beset with difficulties (3). Jones and
Thurstone (1) imply that these difficulties may
result from failure to restrict the semantic
context. They state:


It is probably quite true that a word has no unique
meaning or, more properly, that the meaning of a word
depends upon the context in which it is presented. In
the latter sense, a word has an infinite number of
meanings each corresponding to a particular context.
If such is the case, it is not possible to determine,
either logically, or experimentally, the generalized
meaning of a
word. However, it may be possible to present words in
a particular context and to determine their meaning in
terms of that imposed context (1, p. 31).


TABLE 1
CONCORDANCE WITHIN GROUPS

Group

Aide (I)
Aide (II)
Professional staff


TABLE 2
AGREEMENT BETWEEN GROUPS

Groups                         p

Aide (I) vs. Aide (II)         .001
Professional vs. Aide (I)      .001
Professional vs. Aide (II)     .001


In their study, Jones and Thurstone presented a 
list of descriptive adjectives on a successive in- 
terval schedule. Subjects were asked to indicate, 
along a nine-point scale, the meaning of each word 
or phrase in terms of the degree to which each de- 
noted like or dislike for food. 

The present study stems from the work of Jones 
and Thurstone in that a psychophysical scaling 
procedure, the rank-order technique, was applied 
to a problem of word meanings in a restricted 
semantic context, that of the communication of 
behavioral descriptions between and within pro- 
fessional and nonprofessional groups in a general 
psychiatric hospital. The essential commonality 
and specific differences in meaning of psychiatric 
symptom terms were assessed among psychiatric 
aides and professional staff members in the spe- 
cific semantic context of prognostication. Predic- 
tions were as follows: 

1. Within the semantic context of prognostication,
psychiatric symptom terms evoke significant
commonality of meaning (a) among professional
staff members (psychiatrists and psychologists),
(b) among psychiatric aides, and (c) between
professional staff members and psychiatric aides.

2. Prolonged personal contact between professional
staff members and psychiatric aides increases the
commonality of meaning of terminology.

3. In the case of terms on which aides and
professionals tend to disagree, the aides perceive
those terms that have high personal threat value for
them as contributing toward a poorer prognosis than
do the professional staff members.




METHOD 

Subjects (Ss) for this experiment included 12 pro- 
fessional staff members and 35 psychiatric aides from a 
small state mental hospital. The professional staff were, 
for the most part, recent additions to the staff of the 
hospital, which was then undergoing extensive reorien- 
tation toward education and active treatment pro- 
grams. Consulting relationships between professional 
staff and groups of ward personnel had recently been 
established, thus greatly increasing contact and
communication between professionals and aides.

A list was compiled of 30 psychiatric symptom terms 
and phrases commonly used by both groups of Ss. 
Each item appears on at least one of the ward behavioral 
charts regularly used by aides. Ss were asked to rate 
independently each of the 30 items on a 10-point scale
ranging from “least serious” to “most serious” in terms
of the degree to which each contributes toward a
favorable prognosis. The distribution of rankings was
forced, each S being compelled to rank three items
at each of the 10 scale points, thus yielding a modified
rank-order distribution. After a six-month interval,
the aides repeated this procedure.
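
The forced distribution described above (three items at each of the 10 scale points) can be checked mechanically. The following is only an illustrative sketch: the function name and the two example raters are hypothetical and do not come from the study.

```python
from collections import Counter


def follows_forced_distribution(ratings, n_points=10, per_point=3):
    """Return True if exactly `per_point` items sit at each scale point 1..n_points."""
    if len(ratings) != n_points * per_point:
        return False
    counts = Counter(ratings)
    return all(counts.get(point, 0) == per_point
               for point in range(1, n_points + 1))


# Hypothetical rater who followed instructions: three items per scale point.
valid = [point for point in range(1, 11) for _ in range(3)]

# Hypothetical rater who placed four items at point 10 and only two at point 9.
invalid = valid.copy()
invalid[24] = 10

print(follows_forced_distribution(valid))
print(follows_forced_distribution(invalid))
```

A protocol failing such a check corresponds to the kind of record that would have to be set aside for not following the forced-ranking instructions.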


RESULTS AND DISCUSSIONS 


Two professional staff members and eight aides 
had to be dropped from the study because of failure 
to follow instructions on the forced ranking. The 
data reported, therefore, are based on the remain- 
ing 37 Ss. 

Within each set of ratings, frequency tables were 
constructed by tabulating the number of times each 
symptom term was ranked at each scale point. In 
order to treat the data by rank-order technique, 
each of the 10 scale points was regarded as a three- 
way tie in a distribution of 30 items. Chi squares 
of concordance were then computed using the 
formula of Friedman’s “problem of m rankings” 
(2). These results are reported in Table 1. Each 
value is significant at well beyond the .001 level, 
indicating a high degree of rater agreement within 
each set of ratings. These data confirm the first 
prediction with respect to commonality of meaning 
within groups. 
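
As a rough illustration of the concordance computation described above, the sketch below converts each rater's scale points to midranks (so that each scale point is a three-way tie), then computes Kendall's W and the corresponding Friedman chi square with the standard correction for ties. It is an assumption that a tie correction of this form was applied; the text does not say. The function names and the example data are hypothetical.

```python
def midranks(scores):
    """Rank scores 1..n, giving tied scores the mean of the ranks they span."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def tie_term(scores):
    """Sum of (t**3 - t) over each group of t tied scores for one rater."""
    counts = {}
    for s in scores:
        counts[s] = counts.get(s, 0) + 1
    return sum(t ** 3 - t for t in counts.values())


def concordance(ratings):
    """Return (Kendall's W, Friedman chi square, df) for m raters by n items."""
    m, n = len(ratings), len(ratings[0])
    ranked = [midranks(row) for row in ratings]
    totals = [sum(row[j] for row in ranked) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    ties = sum(tie_term(row) for row in ratings)
    w = 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)
    return w, m * (n - 1) * w, n - 1


# Hypothetical data: three raters giving identical forced rankings of 30 items.
one_rater = [point for point in range(1, 11) for _ in range(3)]
w, chi_square, df = concordance([one_rater, one_rater, one_rater])
print(w, chi_square, df)
```

With perfect agreement W equals 1 and the chi square equals m(n - 1); two raters with exactly opposite rankings give W equal to 0.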

For each set of ratings (first and second
administrations to aides and single administration to
professional staff), item means for the symptom terms
were computed and ranked. Agreement between groups
was measured by rank-order correlations, reported in
Table 2. All these coefficients are significant beyond
the .001 level. As predicted, the coefficient between
the professional staff members and the aides on the
second administration is larger than on the first, but
the difference falls short of acceptable significance.
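
The rank-order correlation between two groups' item means can be sketched as follows. This is an illustration only: it computes Spearman's coefficient as a Pearson correlation on midranks (which also handles tied means), and the item means shown are invented for the example rather than taken from Table 3.

```python
from math import sqrt


def midranks(values):
    """Rank values 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(group_a_means, group_b_means):
    """Rank-order correlation between two sets of item means."""
    return pearson(midranks(group_a_means), midranks(group_b_means))


# Hypothetical item means for five symptom terms in two groups of raters.
professionals = [8.8, 7.1, 6.4, 5.2, 3.0]
aides = [7.6, 8.0, 5.9, 6.1, 2.5]
print(round(spearman_rho(professionals, aides), 2))
```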

The item means and ranks are presented in Ta- 
ble 3. The rank position of each symptom term 
was compared over the three sets and the three 
possible rank discrepancy scores computed for 


TABLE 3
MEAN RANK AND POSITION FOR EACH GROUP OF RATERS

                                   Professionals   Aide I         Aide II
Item                               Mean   Rank     Mean   Rank    Mean   Rank

Delusions of persecution           8.80   1        7.63
Attempted homicide
Mute
Apathetic
Unable to care for personal needs
Hallucinations
Attempted suicide
Delusions of grandeur
Incontinent
Seclusive
Very prayerful
Sexually

a Items with significant disagreement, judged low personal threat.
b Items with significant disagreement, judged high personal threat.


each item, with a possible range of −29 to +29
and an actual range of −13 to +15. All discrepancies
greater than ±6.5 fall outside the .05 confidence
interval around the median value. There were 27 such
scores representing 15 symptom terms. These 15 items
(indicated in Table 3) represent “significant”
disagreement between groups of judges.

Each of five clinical psychologists was asked to
consider these “significant” items and check those
considered to possess high personal threat value
for the psychiatric aide. Six of the 15 symptom
terms were regarded as threatening by four of the
five judges, the arbitrary criterion. The interrater
tetrachoric coefficient of agreement was .78.

TABLE 4
AGREEMENT BETWEEN STUDENTS AND OTHER GROUPS

Groups

Students vs. Professionals
Students vs. Aide (I)
Students vs. Aide (II)


TABLE 5
NUMBER OF ACTIVE AND PASSIVE ITEMS VIEWED AS
MORE SERIOUS OR LESS SERIOUS BY THE
PROFESSIONAL STAFF

                                  Active   Passive

Professional staff rating less    12.5     2.5       96.44   .001
serious than nonprofessional

Professional staff rating more    12
serious


As predicted, the six high personal threat items
were rated higher by the aides. Seven of the nine
remaining items were rated higher by the professional
staff. With 13 of the 15 items conforming to the
prediction, the binomial expansion is significant
beyond the .01 level. These data suggest that the
meaning that persons attribute to behavioral
descriptions is predictably influenced by the degree
to which the behavior makes them feel uncomfortable.
Behaviors thought by experts to elicit threat
reactions from psychiatric aides are regarded by
the aides as contributing toward a poorer prognosis
than would be assigned by professional staff
members. There is reason to believe that, in a
therapeutic milieu, a worker’s behavior toward a
patient will be influenced by his prognostic
predictions. Such phenomena have been demonstrated,
for example, in studies of mutual withdrawal (4).
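
The binomial result quoted above (13 of 15 items conforming to the prediction) can be reproduced with a short calculation. The sketch assumes a chance probability of .5 for each item conforming, which the text implies but does not state; the function name is illustrative.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """Probability of at least `successes` hits in `trials` independent tries."""
    return sum(comb(trials, k) * p ** k * (1 - p) ** (trials - k)
               for k in range(successes, trials + 1))


# 13 or more of 15 items conforming, under an assumed chance probability of .5.
one_tailed = binomial_upper_tail(13, 15)
two_tailed = 2 * one_tailed
print(round(one_tailed, 4), round(two_tailed, 4))
```

Under this assumption both the one-tailed and the two-tailed probabilities fall below .01, in line with the level reported.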

The question may well be raised as to whether 
the semantic norm reflected in these data pertains 
to a psychiatric hospital or to the general English- 
speaking community. The interpretation of items 
that elicited disagreement in terms of personal 
threat is also open to question: perhaps what
differentiates them from other items is simply that
they reflect overt large muscle activity. A group of
nonpsychiatric raters was therefore obtained con- 
sisting of 30 undergraduate students enrolled in 
the first course in psychology. After discussing prog- 
nostication, each S completed the ratings of the 
30 symptom terms. 

Statistical analysis of these data paralleled that
previously discussed. The chi square of 301.7
indicates significant concordance within the group.
Rank-order correlations computed between this set
of ratings and the initial sets are all significant.
(See Table 4.) In short, commonality is extended,
and a general English semantic norm seems indicated.

Rank discrepancies were computed between the
professionals and the students, and examined in
terms of the previously established confidence
interval. Without exception, where the professional
staff ranked an item as more serious than the aides,
they also ranked it as more serious than the
students. The professionals thus appear to diverge
from the general semantic norm. The aides and college
students agree with each other to a greater degree
than either agrees with the professional staff.

To check whether the type of activity implied
by the symptom term was associated with the
disagreements that occurred, the list of 30 items was
given to four judges with instructions to designate
each item as active or passive with regard to
voluntary overt large muscle activity. Applying the
criterion of three agreements, 15 items were
designated as active and 15 as passive.

The rank positions of each item as assigned by 
the three nonprofessional groups were averaged and 
these figures compared with the rankings of the 
professionals. These data, reported in Table 5, in- 
dicate a sharp difference between professional and 
nonprofessional raters. Active items were rated 
consistently as more serious by the aides and stu- 
dents, while passive items were so rated by the 
psychologists and psychiatrists. The essential
dichotomy reflected supports the view that the
groups of raters were differentially influenced by
the type of activity implied by the symptom terms.



SUMMARY 


Commonality of meaning of psychiatric symptom
terms among psychiatric aides and professional
staff members was evaluated in the semantic context
of prognostication. Groups of psychiatric aides
and professional hospital personnel ranked 30
symptom terms along a 10-point scale in terms of
the degree to which each contributes toward
favorable prognosis. Commonality of meaning was
demonstrated between and within groups, but
exceptions occurred as certain symptoms considered
threatening to the aide were rated as more serious
by aides.

Integration of these data with ratings completed 
by college students raised the hypothesis of a gen- 
eral English semantic norm for psychiatric symp- 
tom terms. This hypothesis was supported with all 
measures of commonality achieving significance. 
As indicated by specific differences between groups, 
the highly trained professional workers tend to 
diverge from the general semantic norm. 

This divergence of the professional subjects is
highly related to the active-passive dimension of
the behaviors rated. Items implying overt large
muscle activity were consistently viewed as
contributing toward a poorer prognosis by the
nonprofessional groups of raters. One would wonder to
what degree these data reflect differences in
“knowledge,” as opposed to differences in attitudes
regarding prognostication, psychopathology, and
human behavior in general.


TIME JUDGMENT, AESTHETIC PREFERENCE, AND NEED FOR ACHIEVEMENT¹


HELEN B. GREEN AND ROBERT H. KNAPP

Wesleyan University


1 The research reported in this article was supported
in part from a grant by the Behavioral Sciences
Program of the Ford Foundation to David C. McClelland
for study of the relationship between achievement
motivation and economic development.


IN THE following report, certain measures of
time judgment are studied in relation to a pattern
of aesthetic preferences among Scottish tartans
which has been established for McClelland’s measure
of n Achievement (2). The present study attempts
to demonstrate that performance in time judgments
is related to asceticism of aesthetic taste, which
has already been shown to correlate with achievement
motivation. The subjects (Ss) consisted of 29
teachers ranging in age from 25 to 45 years who were
enrolled in summer school at Wesleyan University.
They proved quite heterogeneous with respect
to background and educational interests.
Each S was given three tests as follows:

1. The Tube Test. This test was designed to 
measure the S’s capacity to estimate the time 
required for a constantly moving point to 
reach a fixed mark after its rate of progress 
had been briefly observed. For this purpose, 
an apparatus was constructed consisting of a 
glass tube approximately three feet in length 
and two inches in diameter. At the beginning 
of each test all water was evacuated from the 
tube. At a given signal, a pump was started 
causing water to rise in the tube. For the first 
ten inches, the rate of rise in the water column 
was observed by the S, and thereafter the tube 
was masked by a heavy tape. S was required 
to indicate when he believed the water column 
had reached a designated mark. Six trials were 
undertaken by each S, three at each of two 
fixed marks located at 20 and 28 inches respec- 
tively. S was not informed of the accuracy of 
his judgment. The score on this test was the 
average judgment in inches for the six trials. 
A low score indicated that the individual had 
underestimated the time required for the 
column to reach the mark. 

2. The Events Test. This was the second test
involving judgments of time. In this instance,
six events of recent history were presented
with the request that S indicate the approximate
month and year in which each had occurred.
These events were:


Supreme Court ruling on segregation.
Stalin’s death.
The Kinsey Report.
The outbreak of the Korean War.
The McCarthy vs. Army hearings.
Eisenhower’s operation.


This test was scored by first computing for
each S the number of months from the present
to the estimated date of each event. For every
event the estimates of all Ss were reduced to
normalized T scores and then averaged for
each individual. This procedure provided a
single score for each S such that all event
estimates were equally weighted regardless
of the mean and SD of the estimates in terms
of actual months. A low score indicated that
the S tended to recall the events as nearer
the present, and a high score that he estimated
them as more remote.
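
The scoring just described can be sketched as follows. Two simplifying assumptions are made and should be read as such: the T score is taken as the linear form 50 + 10z (a strictly "normalized" T score would first map ranks onto the normal curve), and the population SD is used. The function names and the month estimates are hypothetical.

```python
from statistics import mean, pstdev


def t_scores(values):
    """Linear T scores (mean 50, SD 10) for one event's estimates."""
    mu, sd = mean(values), pstdev(values)
    return [50 + 10 * (v - mu) / sd for v in values]


def events_scores(months_by_event):
    """Average each subject's T scores over events.

    months_by_event[e][s] is subject s's estimate, in months before the
    present, of when event e occurred.
    """
    per_event = [t_scores(estimates) for estimates in months_by_event]
    n_subjects = len(months_by_event[0])
    return [mean(event[s] for event in per_event) for s in range(n_subjects)]


# Hypothetical estimates for two events by three subjects.
months_by_event = [
    [12, 24, 36],  # a recent event
    [40, 50, 60],  # a more remote event
]
scores = events_scores(months_by_event)
print([round(s, 2) for s in scores])
```

Because each event is standardized separately, the more remote event does not dominate the average even though its raw estimates are larger.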


TABLE 1
CORRELATIONS BETWEEN TUBE TEST, EVENTS TEST,
AND TARTAN TEST SCORED FOR n ACHIEVEMENT

            Events     Tartan (for n Achievement)

Tube        +.35       +.44
Events                 +.42

Note: p = .01 at .46; p = .05 at .36.


3. The Tartan Test. This was the third test 
administered and consisted of 30 lithographic 
reproductions of Scottish tartans approxi- 
mately 3x5 inches mounted on cards 
8x11 inches in size. These reproductions 
were obtained from Robert Bain’s The Clans 
and Tartans of Scotland (1) and varied widely 
with respect to predominant color, fineness of 
texture, complexity of design, degree of con- 
trast, etc. The patterns were presented, 
mounted on a wall, to each S, with the re- 
quest that he select the five most aesthetically 
pleasing and the five least aesthetically pleas- 
ing. 

On the basis of a previous study (2), each 
tartan had been assigned a rank order for its 
correlation with achievement motivation as 
measured by McClelland (3). In the present 
study, each S was given a score consisting of 
the sum of the ranks of his five most preferred 
tartans minus the sum of the ranks of his five 
least preferred tartans. Thus a low or negative 
score indicated that S preferred tartans posi- 
tively correlated with achievement motiva- 
tion. The earlier study had shown that of the 
tartans occupying the first ten ranks, only the 
Ogilvie contained any significant amount of 
red. The rest, namely, the Campbell of Bread- 
albane, Elliot, Anderson, MacDonnell of 
Glengarry, MacPherson Hunting, Cameron 
of Erracht, Clergy, Sutherland Ancient, 
and Oliphant are uniformly somber and most 
show predominant blue and green color. On 
the other hand, the ten tartans yielding the 
largest negative correlations are strikingly 
different. Seven of the ten embody vivid red, 
namely, the Drummond, Hay, Sinclair, Bro- 
die, Stewart, Ramsay, and Stewart of Appin. 
An eighth, the Barclay, contains vivid yellow, 
while of the remaining two, only one, the
Cummings, has substantial blue or green and
might be characterized as somber. In short,
this ranking of tartans progresses roughly
from somber, or “ascetic,” tartans to vivid
“sensual” designs.
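
The Tartan Test score described above (sum of the ranks of the five most preferred tartans minus the sum of the ranks of the five least preferred) can be sketched as below. The tartan labels and the rank table are placeholders; the actual rank assigned to each tartan in the earlier study is not given here.

```python
def tartan_score(rank_of, most_preferred, least_preferred):
    """Sum of ranks of the five liked tartans minus that of the five disliked."""
    if len(most_preferred) != 5 or len(least_preferred) != 5:
        raise ValueError("each S selects exactly five tartans of each kind")
    liked = sum(rank_of[name] for name in most_preferred)
    disliked = sum(rank_of[name] for name in least_preferred)
    return liked - disliked


# Placeholder labels T01..T30; rank 1 marks the tartan most positively
# correlated with achievement motivation (hypothetical assignment).
rank_of = {f"T{i:02d}": i for i in range(1, 31)}

# Hypothetical S who likes the five top-ranked and dislikes the five bottom-ranked.
most = ["T01", "T02", "T03", "T04", "T05"]
least = ["T26", "T27", "T28", "T29", "T30"]
print(tartan_score(rank_of, most, least))
```

A strongly negative score, as in this example, corresponds to a preference for the somber tartans associated with high achievement motivation.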

The correlations between the three primary 
measures of this study, the Tube Test, the 
Events Test, and the Tartan Test scored for n 
Achievement are given in Table 1. It will be 
seen that two of these correlations, namely 
those between the Events Test and the Tartan 
Test (scored for n Achievement) and be- 
tween the Tube Test and the Tartan Test 
(scored for n Achievement) stand at the quite 
secure significance level of approximately 
.02. The third correlation is almost at the .05 
level. Thus, there appears to be a fairly reli- 
able pattern relating the tendency to recall 
past events as near the present, the tendency 
to anticipate future events before they occur, 
and a preference for that type of aesthetic 
asceticism known to correlate with high 
achievement motivation. 
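
The significance levels cited for the three correlations can be checked approximately by converting each r to a t statistic with N - 2 degrees of freedom. The sketch assumes that all 29 Ss contributed to every correlation, which the text does not state explicitly; the critical values are the usual two-tailed table entries for 27 degrees of freedom.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic for a product-moment correlation r based on n pairs."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


N = 29  # assumed number of Ss entering each correlation
# Two-tailed critical values of t for 27 degrees of freedom.
CRITICAL_T = {0.05: 2.052, 0.02: 2.473, 0.01: 2.771}

correlations = {
    "Tube vs. Events": 0.35,
    "Tube vs. Tartan": 0.44,
    "Events vs. Tartan": 0.42,
}
for label, r in correlations.items():
    t = t_from_r(r, N)
    reached = [level for level, crit in CRITICAL_T.items() if t >= crit]
    print(label, round(t, 2), "levels reached:", reached)
```

On these assumptions the Tube vs. Tartan value passes the .02 level, the Events vs. Tartan value falls just short of it, and the Tube vs. Events value falls just short of .05, which agrees with the description in the text.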


DISCUSSION 


In an earlier study devoted to the analysis 
of Tartan preferences in relation to n Achieve- 
ment, it was proposed that one of the prime 
qualities of the individual high in need 
Achievement is the desire to manipulate his 
environment and to view himself as the active 
agent. This consideration might lead us to
conclude that preference for somber, passive
tartans merely represents a projected wish
that his environment be manipulanda, not
manipulator. In this present study we wish 
to extend this line of thought in connection 
with attitudes toward time. We have shown 
here that persons who anticipate future con- 
ditions before they arrive also tend to recall 
past events as more recent than they really 
were. Such persons, it appears, wish to create 
in their psychological present a sort of “event 
density.” We speculate that the future is 
already upon them while the past has not yet 
slipped away, and the universe that confronts 
them is therefore teeming with opportunities 
for manipulation and achievement. If this 
interpretation is correct, we may tentatively 
propose a dynamic triad, relating ‘“‘parsi- 
monious” time attitudes, achievement moti- 
vation, and asceticism of aesthetic taste, which 
has historically found its manifestation in 
Puritanism.